From ef3dfb7ea52518e7a0e394feebdc2cca20b186ec Mon Sep 17 00:00:00 2001 From: merge from main Date: Thu, 8 May 2025 16:26:11 +0530 Subject: [PATCH 1/8] Updating Strimzi Kafka, SquidProxy, Varnish docs --- .../containers-orchestration/strimzi-kafka.md | 308 +++++++++++++++--- docs/integrations/web-servers/squid-proxy.md | 202 +++--------- docs/integrations/web-servers/varnish.md | 221 +++---------- 3 files changed, 335 insertions(+), 396 deletions(-) diff --git a/docs/integrations/containers-orchestration/strimzi-kafka.md b/docs/integrations/containers-orchestration/strimzi-kafka.md index 5e65dd3b68..f375ae3d4a 100644 --- a/docs/integrations/containers-orchestration/strimzi-kafka.md +++ b/docs/integrations/containers-orchestration/strimzi-kafka.md @@ -35,7 +35,7 @@ This App has been tested with following Kafka versions: ## Sample queries -This sample query string is from the Logs panel of the **Kafka - Logs** dashboard. +This sample query string is from the Logs panel of the **Strimzi Kafka - Logs** dashboard. ```sql messaging_cluster=* messaging_system="kafka" \ @@ -171,37 +171,9 @@ If your Kafka helm chart/pod is writing the logs to standard output then the [Su tailing-sidecar: sidecarconfig;data:/opt/Kafka/kafka_/logs/server.log ``` -3. **Configure Fields in Sumo Logic** - - Create the following Fields in Sumo Logic prior to configuring collection. This ensures that your logs and metrics are tagged with relevant metadata, which is required by the app dashboards. For information on setting up fields, see [Sumo Logic Fields](/docs/manage/fields). - - * `pod_labels_component` - * `pod_labels_environment` - * `pod_labels_messaging_system` - * `pod_labels_messaging_cluster` - -4. **Adding FER for normalizing fields** - - Labels created in Kubernetes environments automatically are prefixed with `pod_labels`. To normalize these for our app to work, we need to create a Field Extraction Rule if not already created for Messaging Application Components. To do so: - 1. 
[**Classic UI**](/docs/get-started/sumo-logic-ui-classic). In the main Sumo Logic menu, select **Manage Data > Logs > Field Extraction Rules**.
[**New UI**](/docs/get-started/sumo-logic-ui). In the top menu select **Configuration**, and then under **Logs** select **Field Extraction Rules**. You can also click the **Go To...** menu at the top of the screen and select **Field Extraction Rules**. - 2. Click the **+ Add** button on the top right of the table. - 3. The **Add Field Extraction Rule** form will appear. Enter the following options: - * **Rule Name**. Enter the name as **App Component Observability - Messaging.** - * **Applied At**. Choose Ingest Time - * **Scope**. Select Specific Data - * Scope: Enter the following keyword search expression: - ```sql - pod_labels_environment=* pod_labels_component=messaging - pod_labels_messaging_system=kafka pod_labels_messaging_cluster=* - ``` - * **Parse Expression**. Enter the following parse expression: - ```sql - if (!isEmpty(pod_labels_environment), pod_labels_environment, "") as environment - | pod_labels_component as component - | pod_labels_messaging_system as messaging_system - | pod_labels_messaging_cluster as messaging_cluster - ``` - 4. Click **Save** to create the rule. +
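Once the rule is saved, the normalized fields can be used directly in log searches. For example, a quick verification query (a sketch; narrow `messaging_cluster` to your own cluster name if desired):

```sql
environment=* component=messaging messaging_system=kafka messaging_cluster=*
| count by messaging_cluster, environment
```

If this query returns results, the extraction rule is tagging your Kafka logs with the metadata the app dashboards expect.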
**FER to normalize the fields in Kubernetes environments**. Labels created in Kubernetes environments are automatically prefixed with `pod_labels`. To normalize these fields so that the app works as expected, a Field Extraction Rule named **AppObservabilityMessagingStrimziKafkaFER** is automatically created for Strimzi Kafka application components.
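The auto-created rule maps the `pod_labels_*` fields onto the names the app dashboards expect. For reference, its parse expression is along these lines (this is only a sketch of what the rule does; no manual setup is needed):

```sql
if (!isEmpty(pod_labels_environment), pod_labels_environment, "") as environment
| pod_labels_component as component
| pod_labels_messaging_system as messaging_system
| pod_labels_messaging_cluster as messaging_cluster
```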
Sumo Logic FER @@ -262,36 +234,264 @@ If your Kafka helm chart/pod is writing the logs to standard output then the [Su ``` -## Installing Kafka Alerts +## Installing the Kafka App -Follow the [instructions](/docs/integrations/containers-orchestration/kafka/#kafka-alerts) to install the monitors. The list of alert can be found [here](/docs/integrations/containers-orchestration/kafka/#kafka-alerts). +import AppInstall2 from '../../reuse/apps/app-install-sc-k8s.md'; + -## Installing the Kafka App +As part of the app installation process, the following fields will be created by default: +* `component` +* `environment` +* `messaging_system` +* `messaging_cluster` +* `pod` + +Additionally, if you're using Squid Proxy in the Kubernetes environment, the following additional fields will be created by default during the app installation process: +* `pod_labels_component` +* `pod_labels_environment` +* `pod_labels_messaging_system` +* `pod_labels_messaging_cluster` -This section demonstrates how to install the Strimzi Kafka App. +## Viewing the Kafka Dashboards -Locate and install the app you need from the **App Catalog**. If you want to see a preview of the dashboards included with the app before installing, click **Preview Dashboards**. +import ViewDashboards from '../../reuse/apps/view-dashboards.md'; -1. From the **App Catalog**, search for and select the app. -2. Select the version of the service you're using and click **Add to Library**. - :::note - Version selection is not available for all apps. - ::: -3. To install the app, complete the following fields. - * **App Name.** You can retain the existing name, or enter a name of your choice for the app. - * **Data Source.** Choose **Enter a Custom Data Filter**, and enter a custom Kafka cluster filter. Examples: - * For all Kafka clusters `messaging_cluster=*` - * For a specific cluster: `messaging_cluster=Kafka.dev.01`. This should be the same as `<>` value provided while defining annotations and labels. 
- * Clusters within a specific environment: `messaging_cluster=Kafka-1 and environment=prod`. This assumes you have set the optional environment tag while configuring collection. This should be same as `<>` and `<>` values provided while defining annotations and labels. -4. **Advanced**. Select the **Location in Library** (the default is the Personal folder in the library), or click **New Folder** to add a new folder. -5. Click **Add to Library**. + -When an app is installed, it will appear in your **Personal** folder, or another folder that you specified. From here, you can share it with your organization. +### Strimzi Kafka - Cluster Overview -Panels will start to fill automatically. It's important to note that each panel slowly fills with data matching the time range query and received since the panel was created. Results won't immediately be available, but with a bit of time, you'll see full graphs and maps. +The **Strimzi Kafka - Cluster Overview** dashboard gives you an at-a-glance view of your Kafka deployment across brokers, controllers, topics, partitions and zookeepers. +Use this dashboard to: +* Identify when brokers don’t have active controllers +* Analyze trends across Request Handler Idle percentage metrics. Kafka’s request handler threads are responsible for servicing client requests ( read/write disk). If the request handler threads get overloaded, the time taken for requests to complete will be longer. If the request handler idle percent is constantly below 0.2 (20%), it may indicate that your cluster is overloaded and requires more resources. +* Determine the number of leaders, partitions and zookeepers across each cluster and ensure they match with expectations -## Viewing the Kafka Dashboards +Kafka dashboards + + +### Strimzi Kafka - Outlier Analysis + +The **Strimzi Kafka - Outlier Analysis** dashboard helps you identify outliers for key metrics across your Kafka clusters. 
Use this dashboard to:
* Analyze trends and quickly discover outliers across key metrics of your Kafka clusters.

Kafka dashboards


### Strimzi Kafka - Replication

The Strimzi Kafka - Replication dashboard helps you understand the state of replicas in your Kafka clusters.

Use this dashboard to monitor the following key metrics:
* In-Sync Replicas (ISR) Expand Rate - The ISR Expand Rate metric displays the one-minute rate of increases in the number of In-Sync Replicas (ISR). ISR expansions occur when a broker comes online, such as when recovering from a failure or adding a new node. This increases the number of in-sync replicas available for each partition on that broker. The expected value for this rate is normally zero.
* In-Sync Replicas (ISR) Shrink Rate - The ISR Shrink Rate metric displays the one-minute rate of decreases in the number of In-Sync Replicas (ISR). ISR shrinks occur when an in-sync broker goes down, as this decreases the number of in-sync replicas available for each partition replica on that broker. The expected value for this rate is normally zero.
  * ISR Shrink vs Expand Rate - A spike in ISR Shrink Rate followed by ISR Expand Rate may indicate that nodes fell behind replication and have either recovered or are in the process of recovering.
  * Failed ISR Updates
  * Under Replicated Partitions Count
  * Under Min ISR Partitions Count - The Under Min ISR Partitions metric displays the number of partitions where the number of In-Sync Replicas (ISR) is less than the minimum number of in-sync replicas specified. The two most common causes of under-min ISR partitions are that one or more brokers are unresponsive, or that the cluster is experiencing performance issues and one or more brokers are falling behind.
+ +Kafka dashboards + + +### Strimzi Kafka - Zookeeper + +The **Strimzi Kafka -Zookeeper** dashboard provides an at-a-glance view of the state of your partitions, active controllers, leaders, throughput and network across Kafka brokers and clusters. + +Use this dashboard to monitor key Zookeeper metrics such as: +* **Zookeeper disconnect rate** - This metric indicates if a Zookeeper node has lostits connection to a Kafka broker. +* **Authentication Failures** - This metric indicates a Kafka Broker is unable to connect to its Zookeeper node. +* **Session Expiration** - When a Kafka broker - Zookeeper node session expires, leader changes can occur and the broker can be assigned a new controller. If this metric is increasing we recommend you: + 1. Check the health of your network. + 2. Check for garbage collection issues and tune your JVMs accordingly. +* Connection Rate. + +Kafka dashboards + +### Strimzi Kafka - Broker + +The Strimzi Kafka - Broker dashboard provides an at-a-glance view of the state of your partitions, active controllers, leaders, throughput, and network across Kafka brokers and clusters. + +Use this dashboard to: +* Monitor Under Replicaed and offline partitions to quickly identify if a Kafka broker is down or over utilized. +* Monitor Unclean Leader Election count metrics - this metric shows the number of failures to elect a suitable leader per second. Unclean leader elections are caused when there are no available in-sync replicas for a partition (either due to network issues, lag causing the broker to fall behind, or brokers going down completely), so an out of sync replica is the only option for the leader. When an out of sync replica is elected leader, all data not replicated from the previous leader is lost forever. +* Monitor producer and fetch request rates. 
* Monitor Log flush rate to determine the rate at which log data is written to disk.

Kafka dashboards


### Strimzi Kafka - Failures and Delayed Operations

The **Strimzi Kafka - Failures and Delayed Operations** dashboard gives you insight into all failures and delayed operations associated with your Kafka clusters.

Use this dashboard to:
* Analyze failed produce requests - A failed produce request occurs when a problem is encountered while processing a produce request. This can happen for a variety of reasons; some common ones are:
  * The destination topic doesn’t exist (if auto-create is enabled, subsequent messages should be sent successfully).
  * The message is too large.
  * The producer is using _request.required.acks=all_ or _-1_, and fewer than the required number of acknowledgements are received.
* Analyze failed fetch requests - A failed fetch request occurs when a problem is encountered while processing a fetch request. This can happen for a variety of reasons, but the most common cause is consumer requests timing out.
* Monitor delayed operations metrics - This group contains metrics on the number of requests that are delayed and waiting in purgatory. The purgatory size metric can be used to determine the root cause of latency. For example, increased consumer fetch times could be explained by an increased number of fetch requests waiting in purgatory. Available metrics are:
  * Fetch Purgatory Size - The Fetch Purgatory Size metric shows the number of fetch requests currently waiting in purgatory. Fetch requests are added to purgatory if there is not enough data to fulfil the request (determined by fetch.min.bytes in the consumer configuration), and the requests wait in purgatory until the time specified by fetch.wait.max.ms is reached or enough data becomes available.
  * Produce Purgatory Size - The Produce Purgatory Size metric shows the number of produce requests currently waiting in purgatory.
Produce requests are added to purgatory if request.required.acks is set to -1 or all, and the requests wait in purgatory until the partition leader receives an acknowledgement from all its followers. If the purgatory size metric keeps growing, some partition replicas may be overloaded. If this is the case, you can choose to increase the capacity of your cluster, or decrease the amount of produce requests being generated. + +Kafka dashboards + + +### Strimzi Kafka - Request-Response Times + +The **Strimzi Kafka - Request-Response** **Times** dashboard helps you get insight into key request and response latencies of your Kafka cluster. + +Use this dashboard to: +* Monitor request time metrics - The Request Metrics metric group contains information regarding different types of request to and from the cluster. Important request metrics to monitor: + 1. **Fetch Consumer Request Total Time** - The Fetch Consumer Request Total Time metric shows the maximum and mean amount of time taken for processing, and the number of requests from consumers to get new data. Reasons for increased time taken could be: increased load on the node (creating processing delays), or perhaps requests are being held in purgatory for a long time (determined by fetch.min.bytes and fetch.wait.max.ms metrics). + 2. **Fetch Follower Request Total Time** - The Fetch Follower Request Total Time metric displays the maximum and mean amount of time taken while processing, and the number of requests to get new data from Kafka brokers that are followers of a partition. Common causes of increased time taken are increased load on the node causing delays in processing requests, or that some partition replicas may be overloaded or temporarily unavailable. + 3. **Produce Request Total Time**- The Produce Request Total Time metric displays the maximum and mean amount of time taken for processing, and the number of requests from producers to send data. 
Some reasons for increased time taken could be increased load on the node causing delays in processing the requests, or requests being held in purgatory for a long time (when `request.required.acks` is equal to '1' or 'all').

Kafka dashboards

### Strimzi Kafka - Logs

This dashboard helps you quickly analyze your Kafka error logs across all clusters.

Use this dashboard to:
* Identify critical events in your Kafka broker and controller logs.
* Examine trends to detect spikes in Error or Fatal events.
* Monitor Broker added/started and shutdown events in your cluster.
* Quickly determine patterns across all logs in a given Kafka cluster.

Kafka dashboards

### Kafka Broker - Performance Overview

The **Kafka Broker - Performance Overview** dashboard gives you an at-a-glance view of the performance and resource utilization of your Kafka brokers and their JVMs.

Use this dashboard to:
* Monitor the number of open file descriptors. If the number of open file descriptors reaches the maximum file descriptor limit, it can cause an IOException error.
* Get insight into garbage collection and its impact on CPU usage and memory.
* Examine how threads are distributed.
* Understand the behavior of class count. If the class count keeps increasing, you may have a problem with the same classes being loaded by multiple classloaders.

Kafka dashboards

### Kafka Broker - CPU

The **Kafka Broker - CPU** dashboard shows information about the CPU utilization of individual broker machines.

Use this dashboard to:
* Get insights into the process and user CPU load of Kafka brokers. High CPU utilization can make Kafka flaky and can cause read/write timeouts.

Kafka dashboards

### Kafka Broker - Memory

The **Kafka Broker - Memory** dashboard shows the percentage of heap and non-heap memory used, and the physical and swap memory usage, of your Kafka broker’s JVM.
Use this dashboard to:
* Understand how memory is used across heap and non-heap memory.
* Examine physical and swap memory usage and make resource adjustments as needed.
* Examine the pending object finalization count, which, when high, can lead to excessive memory usage.

Kafka dashboards


### Kafka Broker - Disk Usage

The **Kafka Broker - Disk Usage** dashboard helps you monitor disk usage across your Kafka brokers.

Use this dashboard to:
* Monitor disk usage percentage on Kafka brokers. This is critical, as Kafka brokers use disk space to store messages for each topic. Other factors that affect disk utilization are:
  1. The replication factor of Kafka topics.
  2. Log retention settings.
* Analyze trends in disk throughput and find any spikes. This is especially important, as disk throughput can be a performance bottleneck.
* Monitor inode bytes used, and disk reads vs. writes. These metrics are important to monitor, as Kafka may not necessarily distribute data away from a heavily occupied disk, which can bring Kafka down.

Kafka dashboards

### Kafka Broker - Garbage Collection

The **Kafka Broker - Garbage Collection** dashboard shows key garbage collector statistics, like the duration of the last GC run, objects collected, threads used, and memory cleared in the last GC run of your Java virtual machine.

Use this dashboard to:
* Understand the amount of time spent in garbage collection. If this time keeps increasing, your Kafka brokers may consume more CPU.
* Understand the amount of memory cleared by garbage collectors across memory pools and its impact on heap memory.

Kafka dashboards


### Kafka Broker - Threads

The **Kafka Broker - Threads** dashboard shows key insights into the usage and type of threads created in your Kafka broker JVM.

Use this dashboard to:
* Understand the dynamic behavior of the system using peak, daemon, and current threads.
* Gain insights into the memory and CPU time of the last executed thread.

Kafka dashboards

### Kafka Broker - Class Loading and Compilation

The **Kafka Broker - Class Loading and Compilation** dashboard helps you get insights into the behavior of class count trends.

Use this dashboard to:

* Determine whether the class count keeps increasing, which indicates that the same classes are being loaded by multiple classloaders.
* Get insights into the time spent by the Java virtual machine during compilation.

Kafka dashboards


### Strimzi Kafka - Topic Overview

The Strimzi Kafka - Topic Overview dashboard helps you quickly identify under-replicated partitions and incoming bytes by Kafka topic, server, and cluster.

Use this dashboard to:

* Monitor under-replicated partitions - The Under Replicated Partitions metric displays the number of partitions that do not have enough replicas to meet the desired replication factor. A partition will also be considered under-replicated if the correct number of replicas exist, but one or more of the replicas have fallen significantly behind the partition leader. The two most common causes of under-replicated partitions are that one or more brokers are unresponsive, or that the cluster is experiencing performance issues and one or more brokers have fallen behind.

  This metric is tagged with cluster, server, and topic info for easy troubleshooting. The colors in the Honeycomb chart are coded as follows:

  1. Green indicates that there are no under-replicated partitions.
  2. Red indicates that a given partition is under-replicated.

Kafka dashboards



### Strimzi Kafka - Topic Details

The Strimzi Kafka - Topic Details dashboard gives you insight into throughput, partition sizes, and offsets across Kafka brokers, topics, and clusters.

Use this dashboard to:
* Monitor metrics like log partition size, log start offset, and log segment count.
* Identify offline/under-replicated partitions count.
Partitions can be in this state on account of resource shortages or broker unavailability. +* Monitor the In Sync replica (ISR) Shrink rate. ISR shrinks occur when an in-sync broker goes down, as it decreases the number of in-sync replicas available for each partition replica on that broker. +* Monitor In Sync replica (ISR) Expand rate. ISR expansions occur when a broker comes online, such as when recovering from a failure or adding a new node. This increases the number of in-sync replicas available for each partition on that broker. + +Kafka dashboards + +## Create monitors for Strimzi Kafka app + +import CreateMonitors from '../../reuse/apps/create-monitors.md'; + + + +### Strimzi Kafka alerts + +| Alert Name | Alert Description and conditions | Alert Condition | Recover Condition | +|:---------------------------------------------|:---------------------------------------------------------------------------------------------------------------------------------------------------------|:-----------------|:-------------------| +| Strimzi Kafka - High Broker Disk Utilization | This alert fires when we detect that a disk on a broker node is more than 85% full. | `>=`85 | < 85 | +| Strimzi Kafka - Failed Zookeeper connections | This alert fires when we detect Broker to Zookeeper connection failures. | | | +| Strimzi Kafka - High Leader election rate | This alert fires when we detect high leader election rate. | | | +| Strimzi Kafka - Garbage collection | This alert fires when we detect that the average Garbage Collection time on a given Kafka broker node over a 5 minute interval is more than one second. | > = 1 | < 1 | +| Strimzi Kafka - Offline Partitions | This alert fires when we detect offline partitions on a given Kafka broker. 
| | | +| Strimzi Kafka - Fatal Event on Broker | This alert fires when we detect a fatal operation on a Kafka broker node | `>=`1 | `<`1 | +| Strimzi Kafka - Underreplicated Partitions | This alert fires when we detect underreplicated partitions on a given Kafka broker. | | | +| Strimzi Kafka - Large number of broker errors | This alert fires when we detect that there are 5 or more errors on a Broker node within a time interval of 5 minutes. | | | +| Strimzi Kafka - High CPU on Broker node | This alert fires when we detect that the average CPU utilization for a broker node is high (`>=`85%) for an interval of 5 minutes. | | | +| Strimzi Kafka - Out of Sync Followers | This alert fires when we detect that there are Out of Sync Followers within a time interval of 5 minutes. | | | +| Strimzi Kafka - High Broker Memory Utilization | This alert fires when the average memory utilization within a 5 minute interval for a given Kafka node is high (`>=`85%). | `>=` 85 | < 85 | -The dashboards are identical to Kafka and their use cases can be found [here](/docs/integrations/containers-orchestration/kafka/#viewing-the-kafka-dashboards). diff --git a/docs/integrations/web-servers/squid-proxy.md b/docs/integrations/web-servers/squid-proxy.md index a7ccdd5e28..4ac4913014 100644 --- a/docs/integrations/web-servers/squid-proxy.md +++ b/docs/integrations/web-servers/squid-proxy.md @@ -22,40 +22,8 @@ This app is tested with the following Squid Proxy versions: This section provides instructions for configuring log and metric collection for the Sumo Logic app for Squid Proxy. -### Step 1: Configure Fields in Sumo Logic -Create the following fields in Sumo Logic prior to configuring the collection. This ensures that your logs and metrics are tagged with relevant metadata, which is required by the app dashboards. For information on setting up fields, see [Sumo Logic Fields](/docs/manage/fields). 
- - - - - -If you're using Squid Proxy in a Kubernetes environment, create the fields: -* `pod_labels_component` -* `pod_labels_environment` -* `pod_labels_proxy_system` -* `pod_labels_proxy_cluster` - - - - -If you're using Squid Proxy in a non-Kubernetes environment, create the fields: -* `component` -* `environment` -* `proxy_system` -* `proxy_cluster` -* `pod` - - - - -### Step 2: Configure Logs and Metrics Collection for Squid Proxy +### Configure Logs and Metrics Collection for Squid Proxy Sumo Logic supports the collection of logs and metrics data from Squid Proxy in both Kubernetes and non-Kubernetes environments. @@ -85,7 +53,7 @@ In the logs pipeline, Sumo Logic Distribution for OpenTelemetry Collector collec It’s assumed that you are using the latest helm chart version. If not, upgrade using the instructions [here](/docs/send-data/kubernetes). ::: -#### Configure Metrics Collection +### Configure Metrics Collection This section explains the steps to collect Squid Proxy metrics from a Kubernetes environment. @@ -297,7 +265,7 @@ Enter in values for the following parameters (marked `CHANGEME` in the snippet a 4. Sumo Logic Kubernetes collection will automatically start collecting metrics from the pods having the labels and annotations defined in the previous step. 5. Verify metrics in Sumo Logic. -#### Configure Logs Collection +### Configure Logs Collection This section explains the steps to collect Squid Proxy logs from a Kubernetes environment. @@ -341,10 +309,11 @@ For all other parameters, see [this doc](/docs/send-data/collect-from-other-data ``` 5. Sumo Logic Kubernetes collection will automatically start collecting logs from the pods having the annotations defined above. 6. Verify logs in Sumo Logic. -3. **Add an FER to normalize the fields in Kubernetes environments** Labels created in Kubernetes environments automatically are prefixed with pod_labels. 
To normalize these for our app to work, we need to create a Field Extraction Rule if not already created for Proxy Application Components. To do so: - 1. [**Classic UI**](/docs/get-started/sumo-logic-ui-classic). In the main Sumo Logic menu, select **Manage Data > Logs > Field Extraction Rules**.
[**New UI**](/docs/get-started/sumo-logic-ui). In the top menu select **Configuration**, and then under **Logs** select **Field Extraction Rules**. You can also click the **Go To...** menu at the top of the screen and select **Field Extraction Rules**. - 2. Click the + Add button on the top right of the table. - 3. The **Add Field Extraction Rule** form will appear. + +
**FER to normalize the fields in Kubernetes environments**. Labels created in Kubernetes environments are automatically prefixed with `pod_labels`. To normalize these fields so that the app works as expected, a Field Extraction Rule named **AppObservabilitySquidProxyFER** is automatically created for Squid Proxy application components.
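Once the fields are normalized, you can verify that Squid Proxy logs are tagged correctly with a search scoped by the app's metadata. A sketch (narrow `proxy_cluster` to the value you configured during collection):

```sql
proxy_system=squidproxy proxy_cluster=*
| count by proxy_cluster, environment
```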
+ @@ -355,7 +324,7 @@ Sumo Logic uses the Telegraf operator for Squid Proxy metric collection and the The process to set up collection for Squid Proxy data is done through the following steps. -#### Configure Logs Collection +### Configure Logs Collection Squid Proxy app supports the default access logs and cache logs format. @@ -421,7 +390,7 @@ If you're using a service like Fluentd, or you would like to upload your logs ma -#### Configure Metrics Collection +### Configure Metrics Collection 1. **Set up a Sumo Logic HTTP Source**. 1. Configure a Hosted Collector for Metrics. To create a new Sumo Logic hosted collector, perform the steps in the [Create a Hosted Collector](/docs/send-data/hosted-collectors/configure-hosted-collector.md) documentation. @@ -653,133 +622,36 @@ If you're using a service like Fluentd, or you would like to upload your logs ma -## Installing Squid Proxy Monitors - -This section and below provide instructions for installing the Squid Proxy app, as well as examples of each of the app dashboards. These instructions assume you have already set up the collection as described above. - -* To install these alerts, you need to have the Manage Monitors role capability. -* Alerts can be installed by either importing a JSON file or a Terraform script. - -#### Pre-Packaged Alerts - -Sumo Logic has provided out-of-the-box alerts available through [Sumo Logic monitors](/docs/alerts/monitors) to help you monitor your Squid Proxy farms. These alerts are built based on metrics and logs datasets and include preset thresholds based on industry best practices and recommendations. - -For details on alerts, see [Alerts](#squid-proxy-alerts). - -There are limits to how many alerts can be enabled - see the [Alerts FAQ](/docs/alerts/monitors/monitor-faq.md). - -### Method A: Importing a JSON file - -1. 
Download the [JSON file](https://github.com/SumoLogic/terraform-sumologic-sumo-logic-monitor/blob/main/monitor_packages/SquidProxy/squidproxy.json) that describes the monitors.
2. The [JSON](https://github.com/SumoLogic/terraform-sumologic-sumo-logic-monitor/blob/main/monitor_packages/SquidProxy/squidproxy.json) contains alerts based on Sumo Logic searches that do not have any scope filters, and that will therefore apply to all Squid Proxy clusters for which data has been collected via the instructions in the previous sections. However, if you would like to restrict these alerts to specific farms or environments, update the JSON file by replacing the text `proxy_system=squidproxy` with ``.

Custom filter examples:

1. For alerts applicable only to a specific farm, your custom filter would be `proxy_cluster=squidproxy-standalone.01`.
2. For alerts applicable to all clusters that start with squidproxy-standalone, your custom filter would be `proxy_cluster=squidproxy-standalone*`.
3. For alerts applicable to a specific farm within a production environment, your custom filter would be `proxy_cluster=squidproxy-1` and `environment=standalone` (this assumes you have set the optional environment tag while configuring collection).
4. [**Classic UI**](/docs/get-started/sumo-logic-ui-classic). In the main Sumo Logic menu, select **Manage Data > Monitoring > Monitors**.
[**New UI**](/docs/get-started/sumo-logic-ui). In the main Sumo Logic menu, select **Alerts > Monitors**. You can also click the **Go To...** menu at the top of the screen and select **Monitors**.
5. Click **Add**.
6. Click **Import**, and then copy-paste the above JSON to import monitors.

The monitors are disabled by default. Once you have installed the alerts using this method, navigate to the Squid Proxy folder under **Monitors** to configure them. See [this](/docs/alerts/monitors) document to enable monitors to send notifications to teams or connections. See the instructions detailed in Step 4 of this [document](/docs/alerts/monitors/create-monitor).


### Method B: Install the alerts using a Terraform script

1. Generate a Sumo Logic access key and access ID for a user that has the Manage Monitors role capability in Sumo Logic, using the instructions in [Access Keys](/docs/manage/security/access-keys). Identify which deployment your Sumo Logic account is in, using this [link](/docs/api/getting-started#sumo-logic-endpoints-by-deployment-and-firewall-security).
2. [Download and install Terraform 0.13](https://www.terraform.io/downloads.html) or later.
3. Download the Sumo Logic Terraform package for Squid Proxy alerts: The alerts package is available in the Sumo Logic GitHub [repository](https://github.com/SumoLogic/terraform-sumologic-sumo-logic-monitor/tree/main/monitor_packages/SquidProxy). You can either download it through the `git clone` command or as a zip file.
4. Alert configuration: After the package has been extracted, navigate to the package directory terraform-sumologic-sumo-logic-monitor/monitor_packages/SquidProxy/.
   1. Edit the **squidproxy.auto.tfvars** file and add the Sumo Logic Access Key, Access ID, and Deployment from Step 1.
      ```bash
      access_id = ""
      access_key = ""
      environment = ""
      ```
   2.
The Terraform script installs the alerts without any scope filters. If you would like to restrict the alerts to specific farms or environments, update the variable `squidproxy_data_source`. Custom filter examples:
-    * A specific cluster: `proxy_cluster=squidproxy.standalone.01`.
-    * All clusters in an environment: `environment=standalone`.
-    * For alerts applicable to all clusters that start with squidproxy-standalone, your custom filter would be `proxy_cluster=squidproxy-standalone*`.
-    * For alerts applicable to a specific farm within a production environment, your custom filter would be `proxy_system=squidproxy` and `environment=standalone`. (This assumes you have set the optional environment tag while configuring collection.)
-    3. All monitors are disabled by default on installation. If you would like to enable all the monitors, set the parameter `monitors_disabled` to `false` in this file.
-    4. By default, the monitors are configured in a monitor **folder** called "**SquidProxy**". If you would like to change the name of the folder, update the monitor folder name in the `folder` key in the `squidproxy.auto.tfvars` file.
-5. If you would like the alerts to send email or connection notifications, modify the file `squidproxy_notifications.auto.tfvars` and populate `connection_notifications` and `email_notifications` as per the examples below. 
- -```bash title="Pagerduty Connection Example" -connection_notifications = [ - { - connection_type = "PagerDuty", - connection_id = "", - payload_override = "{\"service_key\": \"your_pagerduty_api_integration_key\",\"event_type\": \"trigger\",\"description\": \"Alert: Triggered {{TriggerType}} for Monitor {{Name}}\",\"client\": \"Sumo Logic\",\"client_url\": \"{{QueryUrl}}\"}", - run_for_trigger_types = ["Critical", "ResolvedCritical"] - }, - { - connection_type = "Webhook", - connection_id = "", - payload_override = "", - run_for_trigger_types = ["Critical", "ResolvedCritical"] - } - ] -``` -Replace `` with the connection id of the webhook connection. The webhook connection id can be retrieved by calling the [Monitors API](https://api.sumologic.com/docs/#operation/listConnections). - -For overriding payload for different connection types, refer to this [document](/docs/alerts/webhook-connections/set-up-webhook-connections). - -```bash title="Email Notifications Example" -email_notifications = [ - { - connection_type = "Email", - recipients = ["abc@example.com"], - subject = "Monitor Alert: {{TriggerType}} on {{Name}}", - time_zone = "PST", - message_body = "Triggered {{TriggerType}} Alert on {{Name}}: {{QueryURL}}", - run_for_trigger_types = ["Critical", "ResolvedCritical"] - } - ] -``` -6. Install the Alerts: - 1. Navigate to the package directory terraform-sumologic-sumo-logic-monitor/monitor_packages/**SquidProxy**/ and run `terraform init`. This will initialize Terraform and will download the required components. - 2. Run `terraform plan` to view the monitors which will be created/modified by Terraform. - 3. Run **`terraform apply`**. -7. Post Installation: If you haven’t enabled alerts and/or configured notifications through the Terraform procedure outlined above, we highly recommend enabling alerts of interest and configuring each enabled alert to send notifications to other users or services. 
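A common failure mode when editing the notification `.tfvars` values is breaking the escaping inside `payload_override`, which must remain a single valid JSON string. As a quick sanity check (illustrative only, not part of the official install procedure), you can confirm that the payload parses as JSON and still carries the Sumo Logic template placeholders. The service key below is the placeholder from the example above, not a real credential.

```python
import json

# The PagerDuty payload_override from the example above, as it reads after
# Terraform unescapes the embedded double quotes.
payload_override = (
    '{"service_key": "your_pagerduty_api_integration_key",'
    ' "event_type": "trigger",'
    ' "description": "Alert: Triggered {{TriggerType}} for Monitor {{Name}}",'
    ' "client": "Sumo Logic",'
    ' "client_url": "{{QueryUrl}}"}'
)

# json.loads raises json.JSONDecodeError if the escaping is broken.
payload = json.loads(payload_override)
assert payload["event_type"] == "trigger"
assert "{{TriggerType}}" in payload["description"] and "{{Name}}" in payload["description"]
print("payload_override parses cleanly:", sorted(payload))
```

If the check fails, re-escape the double quotes in the `.tfvars` value before running `terraform apply`.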
This is detailed in [this document](/docs/alerts/monitors/create-monitor). - -There are limits to how many alerts can be enabled - see the [Alerts FAQ](/docs/alerts/monitors/monitor-faq.md). ## Installing the Squid Proxy app -This section demonstrates how to install the Squid Proxy app. - -Locate and install the app you need from the **App Catalog**. If you want to see a preview of the dashboards included with the app before installing, click **Preview Dashboards**. +import AppInstall2 from '../../reuse/apps/app-install-sc-k8s.md'; -1. From the **App Catalog**, search for and select the app. -2. Select the version of the service you're using and click **Add to Library**. -:::note -Version selection is not available for all apps. -::: -3. To install the app, complete the following fields. - * **App Name.** You can retain the existing name, or enter a name of your choice for the app. - * **Data Source.** Choose **Enter a Custom Data Filter**, and enter a custom Squid Proxy cluster filter. Examples: - * For all Squid Proxy clusters: `proxy_cluster=*` - * For a specific farm; `proxy_cluster=squidproxy.dev.01`. - * Clusters within a specific environment: `proxy_cluster=squidproxy.dev.01` and `environment=prod`. This assumes you have set the optional environment tag while configuring collection. -3. **Advanced**. Select the **Location in Library** (the default is the Personal folder in the library), or click **New Folder** to add a new folder. -4. Click **Add to Library**. + -Once an app is installed, it will appear in your **Personal** folder, or other folder that you specified. From here, you can share it with your organization. +As part of the app installation process, the following fields will be created by default: +* `component` +* `environment` +* `proxy_system` +* `proxy_cluster` +* `pod` -Panels will start to fill automatically. It's important to note that each panel slowly fills with data matching the time range query and received since the panel was created. 
Results won't immediately be available, but with a bit of time, you'll see full graphs and maps. +Additionally, if you're using Squid Proxy in the Kubernetes environment, the following additional fields will be created by default during the app installation process: +* `pod_labels_component` +* `pod_labels_environment` +* `pod_labels_proxy_system` +* `pod_labels_proxy_cluster` ## Viewing the Squid Proxy Dashboards -:::tip Filter with template variables -Template variables provide dynamic dashboards that can rescope data on the fly. As you apply variables to troubleshoot through your dashboard, you view dynamic changes to the data for a quicker resolution to the root cause. You can use template variables to drill down and examine the data on a granular level. For more information, see [Filter with template variables](/docs/dashboards/filter-template-variables.md). -::: +import ViewDashboards from '../../reuse/apps/view-dashboards.md'; + + ### Overview -The **Squid Proxy - Overview** dashboard provides an at-a-glance view of the activity and health of the SquidProxy clusters and servers by monitoring uptime, number of current clients, latency, bandwidth, destination locations, error and denied requests, URLs accessed. +The **Squid Proxy (Classic) - Overview** dashboard provides an at-a-glance view of the activity and health of the SquidProxy clusters and servers by monitoring uptime, number of current clients, latency, bandwidth, destination locations, error and denied requests, URLs accessed. Use this dashboard to: * Gain insights into information about the destination location your intranet frequently visits by region. @@ -792,7 +664,7 @@ Use this dashboard to: ### Protocol -The **Squid Proxy - Protocol** dashboard provides an insight into the protocols of clusters: the number of HTTP requests, HTTP errors, total bytes transferred, the number of HTTP requests per second, the number of HTTP's bytes per second. 
+The **Squid Proxy (Classic) - Protocol** dashboard provides an insight into the protocols of clusters: the number of HTTP requests, HTTP errors, total bytes transferred, the number of HTTP requests per second, and the number of HTTP bytes per second.

Use this dashboard to:
* Get detailed information about the total number of requests from clients, the total number of HTTP errors sent to clients, the total number of bytes transferred on servers, and the total number of bytes sent to clients.
@@ -803,7 +675,7 @@ Use this dashboard to:

### Performance

-The **Squid Proxy - Performance** dashboard provides an insight into the workload of clusters, the number of page faults IO, percent of file descriptor used, number of memory used, the time for all HTTP requests, the number of objects in the cache, the CPU time.
+The **Squid Proxy (Classic) - Performance** dashboard provides an insight into the workload of clusters: the number of page fault IO operations, the percentage of file descriptors used, memory usage, the time for all HTTP requests, the number of objects in the cache, and the CPU time.

Use this dashboard to:
* Gain insights into the workload of Squid Proxy servers, such as the percentage of file descriptors used, memory usage, and CPU time consumed.
@@ -814,7 +686,7 @@ Use this dashboard to:

### IP Domain DNS Statistics

-The **Squid Proxy - IP Domain DNS Statistics** dashboard provides a high-level view of the number of IPs, the number of FQDN, rate requests cache according to FQDN, rate requests cache according to IPs, the number of DNS queries, time for DNS query.
+The **Squid Proxy (Classic) - IP Domain DNS Statistics** dashboard provides a high-level view of the number of IPs, the number of FQDNs, the cache request rate by FQDN, the cache request rate by IP, the number of DNS queries, and DNS query time.

Use this dashboard to:
* Gain insights into accessed IP statistics: IP Cache Entries, Number and rate of IP Cache requests, Number and rate of IP Cache hits. 
@@ -825,7 +697,7 @@ Use this dashboard to: ### Activity Trend -The **Squid Proxy - Activity Trend** dashboard provides trends around denied request trend, action trend, time spent to serve, success and non-success response, remote hosts. +The **Squid Proxy (Classic) - Activity Trend** dashboard provides trends around denied request trend, action trend, time spent to serve, success and non-success response, remote hosts. Use this dashboard to: * Gain insights into the average amount of time it takes to serve a request and the kind of method the request was. @@ -837,7 +709,7 @@ Use this dashboard to: ### HTTP Response Analysis -The **Squid Proxy - HTTP Response Analysis** dashboard provides insights into HTTP response, HTTP code, the number of client errors, server errors, redirections outlier, URLs experiencing server errors. +The **Squid Proxy (Classic) - HTTP Response Analysis** dashboard provides insights into HTTP response, HTTP code, the number of client errors, server errors, redirections outlier, URLs experiencing server errors. Use this dashboard to: * Gain insights into the count of HTTP responses, such as redirections, successes, client errors, or server errors, on an area chart. @@ -849,7 +721,7 @@ Use this dashboard to: ### Quality of Service -The **Squid Proxy - Quality of Service** dashboard provides insights into latency, the response time of requests according to HTTP action, and the response time according to location. +The **Squid Proxy (Classic) - Quality of Service** dashboard provides insights into latency, the response time of requests according to HTTP action, and the response time according to location. Use this dashboard to: * To identify locations with slow average request response times. 
@@ -858,11 +730,13 @@ Use this dashboard to: Squid Proxy -## Squid Proxy Alerts +## Create monitors for Squid Proxy app + +import CreateMonitors from '../../reuse/apps/create-monitors.md'; -Sumo Logic has provided out-of-the-box alerts available through [Sumo Logic monitors](/docs/alerts/monitors) to help you quickly determine if the Squid Proxy servers are available and performing as expected. These alerts are built based on logs and metrics datasets and have preset thresholds based on industry best practices and recommendations. + -**Sumo Logic provides the following out-of-the-box alerts**: +### Squid Proxy Alerts | Alert Type (Metrics/Logs) | Alert Name | Alert Description | Trigger Type (Critical / Warning) | Alert Condition | Recover Condition | |:---|:---|:---|:---|:---|:---| diff --git a/docs/integrations/web-servers/varnish.md b/docs/integrations/web-servers/varnish.md index 83becc50a2..d811713cd4 100644 --- a/docs/integrations/web-servers/varnish.md +++ b/docs/integrations/web-servers/varnish.md @@ -56,42 +56,7 @@ This section provides instructions for configuring log and metric collection for Configuring log and metric collection for the Varnish app includes the following tasks: -### Step 1: Configure Fields in Sumo Logic - -Create the following Fields in Sumo Logic before configuring the collection. This ensures that your logs and metrics are tagged with relevant metadata, which the app dashboards require. For information on setting up fields, see [Sumo Logic Fields](/docs/manage/fields). 
- - - - - -If you're using Varnish in a Kubernetes environment, create the fields: - -* `pod_labels_component` -* `pod_labels_environment` -* `pod_labels_cache_system` -* `pod_labels_cache_cluster` - - - - -If you're using Varnish in a non-Kubernetes environment, create the fields: -* `component` -* `environment` -* `cache_system` -* `cache_cluster` -* `pod` - - - - - -### Step 2: Configure Logs and Metrics Collection +### Configure Logs and Metrics Collection Instructions below show how to configure Kubernetes and Non-Kubernetes environments. @@ -121,7 +86,7 @@ In the logs pipeline, Sumo Logic Distribution for OpenTelemetry Collector collec It’s assumed that you are using the latest helm chart version. If not, upgrade using the instructions [here](/docs/send-data/kubernetes). ::: -#### Configure Metrics Collection +### Configure Metrics Collection This section explains the steps to collect Varnish metrics from a Kubernetes environment. @@ -170,7 +135,7 @@ For all other parameters, see [this doc](/docs/send-data/collect-from-other-data 4. Verify metrics in Sumo Logic. -#### Configure Logs Collection +### Configure Logs Collection This section explains the steps to collect Varnish logs from a Kubernetes environment. @@ -210,26 +175,9 @@ For all other parameters, see [this doc](/docs/send-data/collect-from-other-data ``` 4. Sumo Logic Kubernetes collection will automatically start collecting logs from the pods having the annotations defined above. 5. Verify logs in Sumo Logic. -3. **Add an FER to normalize the fields in Kubernetes environments**. Labels created in Kubernetes environments automatically are prefixed with pod_labels. To normalize these for our app to work, we need to create a Field Extraction Rule if not already created for WebServer Application Components. To do so: - 1. [**Classic UI**](/docs/get-started/sumo-logic-ui-classic). In the main Sumo Logic menu, select **Manage Data > Logs > Field Extraction Rules**.
[**New UI**](/docs/get-started/sumo-logic-ui). In the top menu select **Configuration**, and then under **Logs** select **Field Extraction Rules**. You can also click the **Go To...** menu at the top of the screen and select **Field Extraction Rules**. - 2. Click the + Add button on the top right of the table. - 3. The **Add Field Extraction Rule** form will appear: - 4. Enter the following options: - * **Rule Name**. Enter the name as **App Observability - Cache**. - * **Applied At.** Choose **Ingest Time** - * **Scope**. Select **Specific Data** - * **Scope**: Enter the following keyword search expression: - ```sql - pod_labels_environment=* pod_labels_component=cache pod_labels_cache_cluster=* pod_labels_cache_cluster= - ``` - * **Parse Expression**.Enter the following parse expression: - ```sql - if (!isEmpty(pod_labels_environment), pod_labels_environment, "") as environment - | pod_labels_component as component - | pod_labels_cache_system as cache_system - | pod_labels_cache_cluster as cache_cluste - ``` - 5. Click **Save** to create the rule. + +
**FER to normalize the fields in Kubernetes environments.** Labels created in Kubernetes environments automatically are prefixed with pod_labels. To normalize these for our app to work, a Field Extraction Rule named **AppObservabilityVarnishFER** is automatically created for Varnish Application Components. +
@@ -240,7 +188,7 @@ We use the Telegraf operator for Varnish metric collection and Sumo Logic Instal Telegraf runs on the same system as Varnish, and uses the [Varnish input plugin](https://github.com/influxdata/telegraf/tree/master/plugins/inputs/varnish) to obtain Varnish metrics, and the Sumo Logic output plugin to send the metrics to Sumo Logic. Logs from Varnish on the other hand are sent to a Sumo Logic Local File source. -#### Configure Metrics Collection +### Configure Metrics Collection This section provides instructions for configuring metrics collection for the Sumo Logic app for Varnish. @@ -287,7 +235,7 @@ Once you have finalized your `telegraf.conf` file, you can start or reload the t At this point, Varnish metrics should start flowing into Sumo Logic. -#### Configure Logs Collection +### Configure Logs Collection This section provides instructions for configuring log collection for Varnish running on a non-Kubernetes environment for the Sumo Logic app for Varnish. @@ -327,124 +275,36 @@ At this point, Varnish logs should start flowing into Sumo Logic. -## Installing Varnish Monitors - -Sumo Logic has provided pre-packaged alerts available through [Sumo Logic monitors](/docs/alerts/monitors) to help you proactively determine if a Varnish cluster is available and performing as expected. These monitors are based on metric and log data and include pre-set thresholds that reflect industry best practices and recommendations. For more information about individual alerts, see [Varnish Alerts](#varnish-alerts). - -To install these monitors, you must have the **Manage Monitors** role capability. - -There are limits to how many alerts can be enabled. For more information, see [Monitors](/docs/alerts/monitors/create-monitor) for details. - -You can install monitors by importing a JSON file or using a Terraform script. - -### Method A: Importing a JSON file - -1. 
Download the [JSON file](https://github.com/SumoLogic/terraform-sumologic-sumo-logic-monitor/blob/main/monitor_packages/Varnish/varnish.json) that describes the monitors. -2. The [JSON](https://github.com/SumoLogic/terraform-sumologic-sumo-logic-monitor/blob/main/monitor_packages/Varnish/varnish.json) contains the alerts based on Sumo Logic searches that do not have any scope filters. Therefore, they will apply to all Varnish clusters for which data has been collected via the instructions in the previous sections. However, if you would like to restrict these alerts to specific clusters or environments, update the JSON file by replacing the text `cache_cluster=*` with ``. Custom filter examples:
-    * For alerts applicable only to a specific cluster, your custom filter would be `cache_cluster=dev-varnish01`
-    * For alerts applicable to all clusters that start with `varnish-prod`, your custom filter would be `cache_cluster=varnish-prod*`
-    * For alerts applicable to a specific cluster within a production environment, your custom filter would be `cache_cluster=dev-varnish01` AND `environment=prod`. This assumes you have set the optional environment tag while configuring collection.
-3. [**Classic UI**](/docs/get-started/sumo-logic-ui-classic). In the main Sumo Logic menu, select **Manage Data > Monitoring > Monitors**.<br/>
[**New UI**](/docs/get-started/sumo-logic-ui). In the main Sumo Logic menu, select **Alerts > Monitors**. You can also click the **Go To...** menu at the top of the screen and select **Monitors**. -4. Click **Add**. -5. Click **Import**. -6. On the **Import Content** popup, enter **Varnish** in the **Name** field, paste the JSON into the popup, and click **Import**. -7. The monitors are created in a "Varnish" folder. The monitors are disabled by default. See the [Monitors](/docs/alerts/monitors) topic for information about enabling monitors and configuring notifications or connections. -
-### Method B: Using a Terraform script
-
-1. Generate an access key and access ID for a user with the **Manage Monitors** role capability; for instructions, see [Access Keys](/docs/manage/security/access-keys).
-2. Download [Terraform 0.13](https://www.terraform.io/downloads.html) or later and install it.
-3. Download the Sumo Logic Terraform package for the Varnish monitors. The alerts package is available in the Sumo Logic GitHub [repository](https://github.com/SumoLogic/terraform-sumologic-sumo-logic-monitor/tree/main/monitor_packages). You can either download it using the `git clone` command or as a zip file.
-4. Alert Configuration. After extracting the package, navigate to the `terraform-sumologic-sumo-logic-monitor/monitor_packages/Varnish/` directory.
-
-Edit the `varnish.auto.tfvars` file and add the Sumo Logic Access Key and Access ID from Step 1 and your Sumo Logic deployment. If you're not sure of your deployment, see [Sumo Logic Endpoints and Firewall Security](/docs/api/getting-started#sumo-logic-endpoints-by-deployment-and-firewall-security).
-
-```sql
-access_id = ""
-access_key = ""
-environment = ""
-```
-
-The Terraform script installs the alerts without any scope filters; if you would like to restrict the alerts to specific clusters or environments, update the `varnish_data_source` variable. 
For example:
-    * To configure alerts for a specific cluster, set `varnish_data_source` to something like: `cache_cluster=varnish.prod.01`
-    * To configure alerts for all clusters in an environment, set `varnish_data_source` to something like: `environment=prod`
-    * To configure alerts for multiple clusters using a wildcard, set `varnish_data_source` to something like: `cache_cluster=varnish-prod*`
-    * To configure alerts for a specific cluster within a specific environment, set `varnish_data_source` to something like: `cache_cluster=varnish-1 and environment=prod`. This assumes you have configured and applied Fields as described in [Step 1: Configure Fields for Sumo Logic](#step-1-configure-fields-in-sumo-logic).
-
-All monitors are disabled by default on installation. To enable all of the monitors, set the `monitors_disabled` parameter to `false`.
-
-By default, the monitors will be located in a "Varnish" folder on the **Monitors** page. To change the name of the folder, update the monitor folder name in the `folder` variable in the `varnish.auto.tfvars` file.
-
-5. If you want the alerts to send email or connection notifications, edit the `varnish_notifications.auto.tfvars` file to populate the `connection_notifications` and `email_notifications` sections. Examples are provided below. 
- -```bash title="Pagerduty connection example" -connection_notifications = [ - { - connection_type = "PagerDuty", - connection_id = "", - payload_override = "{\"service_key\": \"your_pagerduty_api_integration_key\",\"event_type\": \"trigger\",\"description\": \"Alert: Triggered {{TriggerType}} for Monitor {{Name}}\",\"client\": \"Sumo Logic\",\"client_url\": \"{{QueryUrl}}\"}", - run_for_trigger_types = ["Critical", "ResolvedCritical"] - }, - { - connection_type = "Webhook", - connection_id = "", - payload_override = "", - run_for_trigger_types = ["Critical", "ResolvedCritical"] - } - ] -``` - -In the variable definition below, replace `` with the connection ID of the Webhook connection. You can obtain the Webhook connection ID by calling the [Monitors API](https://api.sumologic.com/docs/#operation/listConnections). - -For information about overriding the payload for different connection types, see [Set Up Webhook Connections](/docs/alerts/webhook-connections/set-up-webhook-connections). - -```bash title="Email notifications example" -email_notifications = [ - { - connection_type = "Email", - recipients = ["abc@example.com"], - subject = "Monitor Alert: {{TriggerType}} on {{Name}}", - time_zone = "PST", - message_body = "Triggered {{TriggerType}} Alert on {{Name}}: {{QueryURL}}", - run_for_trigger_types = ["Critical", "ResolvedCritical"] - } - ] -``` - -6. Install Monitors: - 1. Navigate to the `terraform-sumologic-sumo-logic-monitor/monitor_packages/varnish/` directory and run terraform init. This will initialize Terraform and download the required components. - 2. Run `terraform plan` to view the monitors that Terraform will create or modify. - 3. Run `terraform apply`. -7. This section demonstrates how to install the Varnish app. ## Installing the Varnish app -Locate and install the app you need from the **App Catalog**. If you want to see a preview of the dashboards included with the app before installing, click **Preview Dashboards**. 
+import AppInstall2 from '../../reuse/apps/app-install-sc-k8s.md'; + + -1. From the **App Catalog**, search for and select the app. -2. Select the version of the service you're using and click **Add to Library**. Version selection applies only to a few apps currently. For more information, see [Installing the Apps from the Library](/docs/get-started/apps-integrations). -3. To install the app, complete the following fields. - * **App Name.** You can retain the existing name or enter a name of your choice for the app. - * **Data Source.** Choose **Enter a Custom Data Filter**, and enter a custom Varnish cluster filter. Examples: - * For all Varnish clusters `cache_cluster=*` - * For a specific cluster: `cache_cluster=varnish.dev.01.` - * Clusters within a specific environment: `cache_cluster=varnish-1 and environment=prod`. This assumes you have set the optional environment tag while configuring collection. -4. **Advanced**. Select the **Location in the Library** (the default is the Personal folder in the library), or click **New Folder** to add a new folder. -5. Click **Add to Library**. +As part of the app installation process, the following fields will be created by default: +* `component` +* `environment` +* `cache_system` +* `cache_cluster` +* `pod` -Once an app is installed, it will appear in your **Personal** folder or another folder that you specified. From here, you can share it with your organization. +Additionally, if you're using Varnish in the Kubernetes environment, the following additional fields will be created by default during the app installation process: +* `pod_labels_component` +* `pod_labels_environment` +* `pod_labels_cache_system` +* `pod_labels_cache_cluster` -Panels will start to fill automatically. It's important to note that each panel slowly fills with data matching the time range query and received since the panel was created. Results won't immediately be available, but you'll see full graphs and maps in a bit of time. 
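Conceptually, the Kubernetes `pod_labels_*` fields listed above map onto the base fields (`component`, `environment`, `cache_system`, `cache_cluster`) by stripping the `pod_labels_` prefix, which is the normalization the automatically created Field Extraction Rule performs. A minimal illustration of that mapping follows; the metadata values are hypothetical, and this is a sketch of the concept, not Sumo Logic code.

```python
# Hypothetical log metadata as it might arrive from a Kubernetes pod.
k8s_metadata = {
    "pod_labels_component": "cache",
    "pod_labels_environment": "prod",
    "pod_labels_cache_system": "varnish",
    "pod_labels_cache_cluster": "varnish_on_k8s",
}

PREFIX = "pod_labels_"

# Strip the prefix so dashboards can query the normalized field names.
normalized = {
    (key[len(PREFIX):] if key.startswith(PREFIX) else key): value
    for key, value in k8s_metadata.items()
}

print(normalized)
# → {'component': 'cache', 'environment': 'prod', 'cache_system': 'varnish', 'cache_cluster': 'varnish_on_k8s'}
```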
## Viewing Varnish dashboards -:::tip Filter with template variables -Template variables provide dynamic dashboards that can rescope data on the fly. As you apply variables to troubleshoot through your dashboard, you view dynamic changes to the data for a quicker resolution to the root cause. You can use template variables to drill down and examine the data on a granular level. For more information, see [Filter with template variables](/docs/dashboards/filter-template-variables.md). -::: +import ViewDashboards from '../../reuse/apps/view-dashboards.md'; + + ### Overview -The **Varnish - Overview** dashboard provides a high-level view of the activity and health of Varnish servers on your network. Dashboard panels display visual graphs and detailed information on visitor geographic locations, traffic volume and distribution, responses over time, and time comparisons for visitor locations and uptime, cache hit, requests, VLC. +The **Varnish (Classic) - Overview** dashboard provides a high-level view of the activity and health of Varnish servers on your network. Dashboard panels display visual graphs and detailed information on visitor geographic locations, traffic volume and distribution, responses over time, and time comparisons for visitor locations and uptime, cache hit, requests, VLC. Use this dashboard to: * Analyze Request backend, frontend, VLCs, Pool, Thread, VMODs, and cache hit rate. @@ -457,7 +317,7 @@ Use this dashboard to: ### Visitor Traffic Insight -The **Varnish - Visitor Traffic Insight** dashboard provides detailed information on the top documents accessed, top referrers, top search terms from popular search engines, and the media types served. +The **Varnish (Classic) - Visitor Traffic Insight** dashboard provides detailed information on the top documents accessed, top referrers, top search terms from popular search engines, and the media types served. Use this dashboard to: * Gain insights into visitor traffic. 
@@ -469,7 +329,7 @@ Use this dashboard to: ### Web Server Operations -The **Varnish - Web Server Operations** dashboard provides a high-level view combined with detailed information on the top ten bots, geographic locations, and data for clients with high error rates, server errors over time, and non 200 response code status codes. Dashboard panels also show server error logs, error log levels, error responses by the server, and the top URLs responsible for 404 responses. +The **Varnish (Classic) - Web Server Operations** dashboard provides a high-level view combined with detailed information on the top ten bots, geographic locations, and data for clients with high error rates, server errors over time, and non 200 response code status codes. Dashboard panels also show server error logs, error log levels, error responses by the server, and the top URLs responsible for 404 responses. Use this dashboard to: * Determine failures in responding. @@ -480,7 +340,7 @@ Use this dashboard to: ### Traffic Timeline Analysis -The **Varnish - Traffic Timeline Analysis** dashboard provides a high-level view of the activity and health of Varnish servers on your network. Dashboard panels display visual graphs and detailed information on traffic volume and distribution, responses over time, as well as time comparisons for visitor locations and server hits. +The **Varnish (Classic) - Traffic Timeline Analysis** dashboard provides a high-level view of the activity and health of Varnish servers on your network. Dashboard panels display visual graphs and detailed information on traffic volume and distribution, responses over time, as well as time comparisons for visitor locations and server hits. Use this dashboard to: * To understand the traffic distribution across servers, provide insights for resource planning by analyzing data volume and bytes served. 
@@ -490,7 +350,7 @@ Use this dashboard to: ### Outlier Analysis -The **Varnish - Outlier Analysis** dashboard provides a high-level view of Varnish server outlier metrics for bytes served, the number of visitors, and server errors. You can select the time interval over which outliers are aggregated, then hover the cursor over the graph to display detailed information for that point in time. +The **Varnish (Classic) - Outlier Analysis** dashboard provides a high-level view of Varnish server outlier metrics for bytes served, the number of visitors, and server errors. You can select the time interval over which outliers are aggregated, then hover the cursor over the graph to display detailed information for that point in time. Use this dashboard to: * Detect outliers in your infrastructure with Sumo Logic’s machine learning algorithm. @@ -500,7 +360,7 @@ Use this dashboard to: ### Threat Intel -The **Varnish - Threat Intel** dashboard provides an at-a-glance view of threats to Varnish servers on your network. Dashboard panels display threats count over a selected time period, geographic locations where threats occurred, source breakdown, actors responsible for threats, severity, and a correlation of IP addresses, method, and status code of threats. +The **Varnish (Classic) - Threat Intel** dashboard provides an at-a-glance view of threats to Varnish servers on your network. Dashboard panels display threats count over a selected time period, geographic locations where threats occurred, source breakdown, actors responsible for threats, severity, and a correlation of IP addresses, method, and status code of threats. Use this dashboard to: * To gain insights and understand threats in incoming traffic and discover potential IOCs. @@ -510,7 +370,7 @@ Use this dashboard to: ### Backend Servers -The **Varnish - Backend Servers** dashboard provides several metrics that describe the communication between Varnish and its backend servers. 
+The **Varnish (Classic) - Backend Servers** dashboard provides several metrics that describe the communication between Varnish and its backend servers. Use this dashboard to: * Review and manage the health of backend and frontend communication. @@ -519,7 +379,7 @@ Use this dashboard to: ### Bans and Bans Lurker -The **Varnish - Bans and Bans Lurker** dashboard provides you the list of Bans filters applied to keep Varnish from serving stale content. +The **Varnish (Classic) - Bans and Bans Lurker** dashboard provides you the list of Bans filters applied to keep Varnish from serving stale content. Use this dashboard to: * Gain insights into bans and make sure that Varnish is serving the latest content. @@ -529,7 +389,7 @@ Use this dashboard to: ### Cache Performance -The **Varnish - Cache Performance** dashboard provides worker thread related metrics to tell you if your thread pools are healthy and functioning well. +The **Varnish (Classic) - Cache Performance** dashboard provides worker thread related metrics to tell you if your thread pools are healthy and functioning well. Use this dashboard to: * Gain insights into the performance and health of Varnish Cache. @@ -539,7 +399,7 @@ Use this dashboard to: ### Clients -The **Varnish - Clients** dashboard check collects Varnish metrics regarding connections and requests. +The **Varnish (Classic) - Clients** dashboard check collects Varnish metrics regarding connections and requests. Use this dashboard to: * Review the current sessions and load on Varnish. @@ -549,16 +409,21 @@ Use this dashboard to: ### Threads -The **Varnish - Threads** Dashboard helps you to keep track of threads metrics to watch your Varnish Cache. +The **Varnish (Classic) - Threads** Dashboard helps you to keep track of threads metrics to watch your Varnish Cache. Use this dashboard to: * Manage and understand threads in the Varnish system. 
Varnish dashboard -## Varnish Alerts -Sumo Logic has provided out-of-the-box alerts available via[ Sumo Logic monitors](/docs/alerts/monitors) to help you quickly determine if the Varnish cache is available and performing as expected. +## Create monitors for Varnish app + +import CreateMonitors from '../../reuse/apps/create-monitors.md'; + + + +### Varnish Alerts | Alert Type (Metrics/Logs) | Alert Name | Alert Description | Trigger Type (Critical / Warning) | Alert Condition | Recover Condition | |:---|:---|:---|:---|:---|:---| From 0fba9110fafeb801291bfca1d5fe4af56b5868fa Mon Sep 17 00:00:00 2001 From: merge from main Date: Thu, 8 May 2025 16:46:53 +0530 Subject: [PATCH 2/8] updating iis 7 docs --- docs/integrations/microsoft-azure/iis-7.md | 36 ++++++++++++---------- 1 file changed, 20 insertions(+), 16 deletions(-) diff --git a/docs/integrations/microsoft-azure/iis-7.md b/docs/integrations/microsoft-azure/iis-7.md index f110969045..a139f595b0 100644 --- a/docs/integrations/microsoft-azure/iis-7.md +++ b/docs/integrations/microsoft-azure/iis-7.md @@ -169,31 +169,35 @@ After a few minutes, your new Source should be propagated down to the Collector ## Field Extraction Rules -* **Name**: Microsoft IIS Logs -* **Scope**: Use the source category set above, such as "IIS_prod" -* **Parse Expression:** -``` -parse regex "^[^#].*?(?\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}) (?\S+?) -(?\S+?) (?\S+?) (?\d+?) (?\S+?) -(?.+?) (?\S+?) (?\S+?) (?\d+?) -(?\d+?) (?\d+?) (?\d+?)$" -``` +
+**FER to normalize the fields**. A Field Extraction Rule named **AppObservabilityIIS7FER** is automatically created for IIS 7/8 Application Components. +
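For reference, IIS 7 writes its access logs in W3C extended format, which is what these rules parse. A typical entry with the default field set (the values below are illustrative, not taken from a real server) looks like this:

```
#Fields: date time s-ip cs-method cs-uri-stem cs-uri-query s-port cs-username c-ip cs(User-Agent) sc-status sc-substatus sc-win32-status time-taken
2013-05-10 17:21:45 10.0.0.5 GET /default.htm - 80 - 203.0.113.24 Mozilla/5.0+(Windows+NT+6.1) 200 0 0 15
```

The field names used by the app correspond to these W3C columns with hyphens replaced by underscores (for example, `c-ip` becomes `c_ip` and `sc-status` becomes `sc_status`).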
## Installing the IIS 7 App -Now that you have set up collection for IIS 7, install the Sumo Logic App for IIS 7 to use the preconfigured searches and dashboards that monitor log events generated by IIS 7. - -import AppInstall from '../../reuse/apps/app-install.md'; +import AppInstall from '../../reuse/apps/app-install-v2.md'; -## Viewing IIS 7 Dashboards +As part of the app installation process, the following fields will be created by default: +* `method` +* `cs_user_agent` +* `c_ip` +* `time_taken` +* `server_ip` +* `s_port` +* `sc_win32_status` +* `sc_status` +* `cs_uri_query` +* `sc_substatus` +* `cs_uri_stem` +* `cs_username` -**Each dashboard has a set of filters** that you can apply to the entire dashboard, as shown in the following example. Click the funnel icon in the top dashboard menu bar to display a scrollable list of filters that are applied across the entire dashboard. +## Viewing IIS 7 Dashboards -You can use filters to drill down and examine the data on a granular level. +import ViewDashboards from '../../reuse/apps/view-dashboards.md'; -**Each panel has a set of filters** that are applied to the results for that panel only, as shown in the following example. Click the funnel icon in the top panel menu bar to display a list of panel-specific filters. 
+ ### Overview Dashboard From d568c1c428dff94d8e92818505d81f0bef4283d1 Mon Sep 17 00:00:00 2001 From: Amee Lepcha Date: Fri, 9 May 2025 10:15:14 +0530 Subject: [PATCH 3/8] Update strimzi-kafka.md --- .../containers-orchestration/strimzi-kafka.md | 142 +++++++++--------- 1 file changed, 71 insertions(+), 71 deletions(-) diff --git a/docs/integrations/containers-orchestration/strimzi-kafka.md b/docs/integrations/containers-orchestration/strimzi-kafka.md index f375ae3d4a..848adb3659 100644 --- a/docs/integrations/containers-orchestration/strimzi-kafka.md +++ b/docs/integrations/containers-orchestration/strimzi-kafka.md @@ -11,14 +11,14 @@ import TabItem from '@theme/TabItem'; icon -This guide provides an overview of Kafka metrics collection from kafka pods deployed with the Strimzi Kafka operator. +This guide provides an overview of Kafka metrics collection from Kafka pods deployed with the Strimzi Kafka operator. The Sumo Logic App for Strimzi Kafka is a unified logs and metrics app. The app helps you to monitor the availability, performance, and resource utilization of Kafka messaging/streaming clusters. Pre-configured dashboards provide insights into the cluster status, throughput, broker operations, topics, replication, zookeepers, node resource utilization, and error logs. -This App has been tested with following Kafka Operator versions: +This App has been tested with the following Kafka Operator versions: * 0.35.0 -This App has been tested with following Kafka versions: +This App has been tested with the following Kafka versions: * 3.4.0 @@ -57,13 +57,13 @@ This section provides instructions for configuring log and metric collection for ### Prerequisites for Kafka Cluster Deployment -Before configuring the collection you will require below items +Before configuring the collection, you will require the following items: -1. Access to the existing kubernetes cluster where strimzi cluster operator is deployed. 
If not done you can follow the strimzi [documentation](https://strimzi.io/docs/operators/latest/deploying.html#con-strimzi-installation-methods_str). +1. Access to the existing Kubernetes cluster where the Strimzi cluster operator is deployed. If not done, you can follow the Strimzi [documentation](https://strimzi.io/docs/operators/latest/deploying.html#con-strimzi-installation-methods_str). -2. Namespace where all the kafka pods will be created or are already deployed. +2. Namespace where all the Kafka pods will be created or deployed. -3. Download the [kafka-metrics-sumologic-telegraf.yaml](https://drive.google.com/file/d/1pvMqYiJu7_nEv2F2RsPKIn_WWs8BKcxQ/view?usp=sharing). If you already have an existing yaml, you will have to merge the contents of both the files. This file contains the Kafka resource. +3. Download the [kafka-metrics-sumologic-telegraf.yaml](https://drive.google.com/file/d/1pvMqYiJu7_nEv2F2RsPKIn_WWs8BKcxQ/view?usp=sharing). If you already have an existing yaml, you will have to merge the contents of both files. This file contains the Kafka resource. ### Deploying Sumo Logic Kubernetes Collection @@ -73,11 +73,11 @@ Before configuring the collection you will require below items kubectl create ns sumologiccollection ``` -2. Download [sumologic_values_eks.yaml](https://drive.google.com/file/d/1YYBmf2akxgfCjWSOdpO2nqf3KmpRc9y0/view?usp=sharing) file. This file contains the remote write configuration for metrics which the app uses. You can add or remove metrics depending upon your use case. +2. Download [sumologic_values_eks.yaml](https://drive.google.com/file/d/1YYBmf2akxgfCjWSOdpO2nqf3KmpRc9y0/view?usp=sharing) file. This file contains the remote write configuration for metrics that the app uses. You can add or remove metrics depending on your use case. -3. Generate the Sumo Logic access IDs and access keys in the Sumo Logic [portal](/docs/manage/security/access-keys/#create-an-access-key). +3. 
Generate the Sumo Logic access IDs and keys in the Sumo Logic [portal](/docs/manage/security/access-keys/#create-an-access-key). -4. Install the Sumo Logic Kubernetes Collection using **sumologic_values_eks.yaml** file (in folder) by following the instructions [here](/docs/send-data/kubernetes/install-helm-chart). Ensure that you are monitoring your Kubernetes clusters with the Telegraf operator enabled. If you are not, then follow [these instructions](/docs/send-data/collect-from-other-data-sources/collect-metrics-telegraf/install-telegraf/) to do so. The below command enables traces and telegraf operators using the credentials generated in the previous step and deploys the 2.10.0 helm chart version. +4. Install the Sumo Logic Kubernetes Collection using **sumologic_values_eks.yaml** file (in folder) by following the instructions [here](/docs/send-data/kubernetes/install-helm-chart). Ensure that you are monitoring your Kubernetes clusters with the Telegraf operator enabled. If you are not, then follow [these instructions](/docs/send-data/collect-from-other-data-sources/collect-metrics-telegraf/install-telegraf/) to do so. The following command enables traces and telegraf operators using the credentials generated in the previous step and deploys the 2.10.0 helm chart version. ```bash helm upgrade --install sumologic sumologic/sumologic \ @@ -90,20 +90,20 @@ Before configuring the collection you will require below items --version 2.10.0 -f sumologic_values_eks.yaml ``` - A collector will be created in your Sumo Logic org with cluster name, provided in above command. You can verify it by going to the [collection page](/docs/send-data/collection/). + A collector will be created in your Sumo Logic org with the cluster name provided in the above command. You can verify it by referring to the [collection page](/docs/send-data/collection/). ### Configure Metrics Collection Follow these steps to collect metrics from a Kubernetes environment: -1. 
**Preparing custom image for running Kafka with Jolokia agent** +1. **Preparing a custom image for running Kafka with Jolokia agent** - Strimzi operator does not support jolokia agent out of the box so we need to update the kafka image. To Build kafka pod image follow below instructions: + Strimzi operator does not support Jolokia agent out of the box, so we need to update the Kafka image. To build the Kafka pod image, follow the instructions below: 1. Download the [Dockerfile](https://drive.google.com/file/d/194cejvIotHyOuGjefUo_rOYaqAm2y9IM/view?usp=sharing). - 2. The above file uses container images available in the publicly available [Strimzi Container Registry](https://quay.io/organization/strimzi). Change the base image `quay.io/strimzi/kafka:latest-kafka-3.4.0` to the respective kafka version's image you want to use. - 3. Download the latest version of the **Jolokia JVM-Agent** from [Jolokia](https://jolokia.org/download.html), rename the file to `jolokia.jar` and place it in the same folder as dockerfile. - 4. Build the docker images using below commands. + 2. The above file uses container images available in the publicly available [Strimzi Container Registry](https://quay.io/organization/strimzi). Change the base image `quay.io/strimzi/kafka:latest-kafka-3.4.0` to the respective Kafka version's image you want to use. + 3. Download the latest version of the **Jolokia JVM-Agent** from [Jolokia](https://jolokia.org/download.html), rename the file to `jolokia.jar`, and place it in the same folder as the dockerfile. + 4. Build the Docker images using the following commands. ```bash docker build --platform -t "${MAIN_DOCKER_TAG}:${KAFKA_APP_TAG}" . docker tag "${MAIN_DOCKER_TAG}:${KAFKA_APP_TAG}" ${REGISTRY}/${REPOSITORY}:${KAFKA_APP_TAG}` @@ -115,7 +115,7 @@ Follow these steps to collect metrics from a Kubernetes environment: docker build --platform linux/amd64 -t "kafka:kafka-3.4.0" . 
docker tag "kafka:kafka-3.4.0" public.ecr.aws/g0d6f4n6/strimzi-kafka-jolokia:kafka-3.4.0 ``` - 5. Push the images in your container repository. Strimzi supports both private container registries as well as public registries.You can either configure image pull secrets at the [Cluster operator level](https://strimzi.io/docs/operators/latest/full/using.html#ref-operator-cluster-str) or in the [PodTemplate section](https://strimzi.io/docs/operators/latest/full/using.html#type-PodTemplate-reference). + 5. Push the images into your container repository. Strimzi supports both private container registries and public registries. You can either configure image pull secrets at the [Cluster operator level](https://strimzi.io/docs/operators/latest/full/using.html#ref-operator-cluster-str) or in the [PodTemplate section](https://strimzi.io/docs/operators/latest/full/using.html#type-PodTemplate-reference). 2. Update the `<>` in **kafka-metrics-sumologic-telegraf.yaml** file downloaded earlier. @@ -123,9 +123,9 @@ Follow these steps to collect metrics from a Kubernetes environment: 1. Open **kafka-metrics-sumologic-telegraf.yaml** in any editor and go to **spec -> kafka -> template -> pod -> metadata -> annotations** section. - 2. In the tags sections(`[inputs.jolokia2_agent.tags]` and `[inputs.disk.tags]`), enter in values for the parameters marked with `<>,<>` in the yaml file: + 2. In the tags sections(`[inputs.jolokia2_agent.tags]` and `[inputs.disk.tags]`), enter values for the parameters marked with `<>,<>` in the yaml file: - * `environment`. Replace `<>` with the deployment environment where the Kafka cluster identified by the value of servers resides. For example: dev, prod or qa. While this value is optional we highly recommend setting it. + * `environment`. Replace `<>` with the deployment environment where the Kafka cluster identified by the value of servers resides. For example: dev, prod, or qa. While this value is optional, we highly recommend setting it. 
* `messaging_cluster`. Replace `<>` with a name to identify this Kafka cluster. This cluster name will be shown in the Sumo Logic dashboards. **Do not modify the following values** as it will cause the Sumo Logic app to not function correctly. @@ -137,7 +137,7 @@ Follow these steps to collect metrics from a Kubernetes environment: * `component: “messaging”` - This value is used by Sumo Logic apps to identify application components. * `messaging_system: “kafka”` - This value identifies the database system. * In the input plugins(`telegraf.influxdata.com/inputs`) section: - * `urls` - The URL to the Kafka server. As telegraf will be run as a sidecar the `urls` should always be localhost. This can be a comma-separated list to connect to multiple Kafka servers. + * `urls` - The URL to the Kafka server. As telegraf will be run as a sidecar, the `urls` should always be localhost. This can be a comma-separated list to connect to multiple Kafka servers. For more information on all other parameters, see [this doc](/docs/send-data/collect-from-other-data-sources/collect-metrics-telegraf/install-telegraf) for more parameters that can be configured in the Telegraf agent globally. @@ -145,12 +145,12 @@ Follow these steps to collect metrics from a Kubernetes environment: ### Configure Logs Collection -If your Kafka helm chart/pod is writing the logs to standard output then the [Sumologic Kubernetes Collection](/docs/integrations/containers-orchestration/kubernetes/#collecting-metrics-and-logs-for-the-kubernetes-app) will automatically capture the logs from stdout and will send the logs to Sumologic.If not then you have to use [tailing-sidecar](https://github.com/SumoLogic/tailing-sidecar/blob/main/README.md) approach. 
+If your Kafka helm chart/pod is writing the logs to standard output, then the [Sumologic Kubernetes Collection](/docs/integrations/containers-orchestration/kubernetes/#collecting-metrics-and-logs-for-the-kubernetes-app) will automatically capture the logs from stdout and will send the logs to Sumologic. If not, then you have to use the [tailing-sidecar](https://github.com/SumoLogic/tailing-sidecar/blob/main/README.md) approach. 1. **Add labels on your Kafka pods** 1. Open **kafka-metrics-sumologic-telegraf.yaml** in any editor and go to **spec -> kafka -> template -> pod -> metadata -> labels** section. - 2. Enter values for the parameters marked with `<>,<>` in the yaml file: - * `environment`. Replace `<>` with the deployment environment where the Kafka cluster identified by the value of servers resides. For example: dev, prod, or qa. While this value is optional, we highly recommend setting it. * `messaging_cluster`. Replace `<>` with a name to identify this Kafka cluster. This cluster name will be shown in the Sumo Logic dashboards. * **Do not modify the following values** as it will cause the Sumo Logic app to not function correctly. @@ -158,9 +158,9 @@ If your Kafka helm chart/pod is writing the logs to standard output then the [Su * `messaging_system: “kafka”` - This value identifies the messaging system. 2. **Collect Kafka logs written to log files (Optional)**. If your Kafka helm chart/pod is writing its logs to log files, you can use a [sidecar](https://github.com/SumoLogic/tailing-sidecar/tree/main/operator) to send log files to standard out. To do this: - 1. Determine the location of the Kafka log file on Kubernetes.
This can be determined from helm chart configurations. + 1. Determine the location of the Kafka log file on Kubernetes. This can be determined from the helm chart configurations. 2. Install the Sumo Logic [tailing sidecar operator](https://github.com/SumoLogic/tailing-sidecar/tree/main/operator#deploy-tailing-sidecar-operator). - 3. Add the following annotation in addition to the existing annotations in **kafka-metrics-sumologic-telegraf.yaml** file. + 3. Add the following annotation to the existing annotations in **kafka-metrics-sumologic-telegraf.yaml** file. ```xml annotations: tailing-sidecar: sidecarconfig;:/` @@ -172,7 +172,7 @@ If your Kafka helm chart/pod is writing the logs to standard output then the [Su ```
-**FER to normalize the fields in Kubernetes environments**. Labels created in Kubernetes environments automatically are prefixed with `pod_labels`. To normalize these for our app to work, a Field Extraction Rule named **AppObservabilityMessagingStrimziKafkaFER** is automatically created for Strimzi Kafka Application Components. +**FER to normalize the fields in Kubernetes environments**. Labels created in Kubernetes environments are automatically prefixed with `pod_labels`. To normalize these for our app to work, a Field Extraction Rule named **AppObservabilityMessagingStrimziKafkaFER** is automatically created for Strimzi Kafka Application Components.
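As a rough illustration of what this normalization does (the label values below are hypothetical, and the real FER runs inside Sumo Logic rather than in a shell), it amounts to stripping the `pod_labels_` prefix so the Kubernetes labels match the field names the app expects:

```shell
# Hypothetical labels as they arrive from a Kubernetes pod, prefixed with pod_labels_
labels="pod_labels_environment=prod pod_labels_messaging_system=kafka pod_labels_messaging_cluster=kafka_on_k8s"

# The FER effectively removes the pod_labels_ prefix, yielding the app's field names
echo "$labels" | sed 's/pod_labels_//g'
# → environment=prod messaging_system=kafka messaging_cluster=kafka_on_k8s
```

After normalization, the `environment`, `messaging_system`, and `messaging_cluster` fields line up with the tags configured for metrics collection, so logs and metrics correlate in the dashboards.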
Sumo Logic FER @@ -188,13 +188,13 @@ If your Kafka helm chart/pod is writing the logs to standard output then the [Su ### Deployment Verification -1. Make sure that the Kafka pods are running and correct annotations / labels are applied by using the command: +1. Make sure that the Kafka pods are running and correct annotations/labels are applied by using the command: ```bash kubectl describe pod ``` 2. Verifying Metrics in Kafka pods - * You can ssh to Kafka pod and run following commands to make sure Telegraf (and Jolokia) is scraping metrics from your Kafka Pod: + * You can ssh to the Kafka pod and run the following commands to make sure Telegraf (and Jolokia) are scraping metrics from your Kafka Pod: ```bash curl localhost:9273/metrics curl http://localhost:8778/jolokia/list @@ -207,23 +207,23 @@ If your Kafka helm chart/pod is writing the logs to standard output then the [Su ### Troubleshooting -1. If you are still not seeing metrics or logs follow the [troubleshooting guide](https://github.com/SumoLogic/sumologic-kubernetes-collection/blob/release-v3.7/docs/troubleshoot-collection.md). +1. If you are still not seeing metrics or logs, follow the [troubleshooting guide](https://github.com/SumoLogic/sumologic-kubernetes-collection/blob/release-v3.7/docs/troubleshoot-collection.md). -2. If you are seeing unhealthy targets in prometheus console, sometimes custom [network policies](https://kubernetes.io/docs/concepts/services-networking/network-policies/) may be responsible for affecting connectivity across prometheus pods and kafka pods. +2. If you are seeing unhealthy targets in the Prometheus console, sometimes custom [network policies](https://kubernetes.io/docs/concepts/services-networking/network-policies/) may be responsible for affecting connectivity across Prometheus pods and Kafka pods. 
Prometheus Targets Console - Follow the below steps below to allow traffic from sumologiccollection namespace: + Follow the steps below to allow traffic from the **sumologiccollection** namespace: - 1. First we need to label the **sumologiccollection** namespace + 1. First, we need to label the **sumologiccollection** namespace ```bash kubectl label namespace sumologiccollection namespace=sumologiccollection ``` - You can validate above by below command + You can validate the above by the command given below: ```bash kubectl get namespaces sumologiccollection --show-labels ``` - 2. Then add below snippet in your existing network policy to allow all traffic from **sumologiccollection** namespace. We can restrict it from specific ports as well but let's first start by allowing all ports.Since **podSelector** works for only pods in the same namespace as network policy we are using **namespaceSelector** property. + 2. Then add the snippet given below in your existing network policy to allow all traffic from the **sumologiccollection** namespace. We can restrict it from specific ports as well, but let's first start by allowing all ports. Since **podSelector** works for only pods in the same namespace as the network policy, we are using the **namespaceSelector** property. ```yaml - from: - namespaceSelector: @@ -261,12 +261,12 @@ import ViewDashboards from '../../reuse/apps/view-dashboards.md'; ### Strimzi Kafka - Cluster Overview -The **Strimzi Kafka - Cluster Overview** dashboard gives you an at-a-glance view of your Kafka deployment across brokers, controllers, topics, partitions and zookeepers. +The **Strimzi Kafka - Cluster Overview** dashboard gives you an at-a-glance view of your Kafka deployment across brokers, controllers, topics, partitions, and zookeepers. Use this dashboard to: * Identify when brokers don’t have active controllers * Analyze trends across Request Handler Idle percentage metrics. 
Kafka’s request handler threads are responsible for servicing client requests ( read/write disk). If the request handler threads get overloaded, the time taken for requests to complete will be longer. If the request handler idle percent is constantly below 0.2 (20%), it may indicate that your cluster is overloaded and requires more resources. -* Determine the number of leaders, partitions and zookeepers across each cluster and ensure they match with expectations +* Determine the number of leaders, partitions, and zookeepers across each cluster and ensure they match with expectations Kafka dashboards @@ -276,7 +276,7 @@ Use this dashboard to: The **Strimzi Kafka - Outlier Analysis** dashboard helps you identify outliers for key metrics across your Kafka clusters. Use this dashboard to: -* To analyze trends, and quickly discover outliers across key metrics of your Kafka clusters +* To analyze trends and quickly discover outliers across key metrics of your Kafka clusters Kafka dashboards @@ -286,12 +286,12 @@ Use this dashboard to: The Strimzi Kafka - Replication dashboard helps you understand the state of replicas in your Kafka clusters. Use this dashboard to monitor the following key metrics: -* In-Sync Replicas (ISR) Expand Rate - The ISR Expand Rate metric displays the one-minute rate of increases in the number of In-Sync Replicas (ISR). ISR expansions occur when a broker comes online, such as when recovering from a failure or adding a new node. This increases the number of in-sync replicas available for each partition on that broker.The expected value for this rate is normally zero. -* In-Sync Replicas (ISR) Shrink Rate - The ISR Shrink Rate metric displays the one-minute rate of decreases in the number of In-Sync Replicas (ISR). ISR shrinks occur when an in-sync broker goes down, as it decreases the number of in-sync replicas available for each partition replica on that broker.The expected value for this rate is normally zero. 
- * ISR Shrink Vs Expand Rate - If you see a Spike in ISR Shrink followed by ISR Expand Rate - this may be because of nodes that have fallen behind replication and they may have either recovered or are in the process of recovering now. +* In-Sync Replicas (ISR) Expand Rate - The ISR Expand Rate metric displays the one-minute rate of increases in the number of In-Sync Replicas (ISR). ISR expansions occur when a broker comes online, such as when recovering from a failure or adding a new node. This increases the number of in-sync replicas available for each partition on that broker. The expected value for this rate is normally zero. +* In-Sync Replicas (ISR) Shrink Rate - The ISR Shrink Rate metric displays the one-minute rate of decreases in the number of In-Sync Replicas (ISR). ISR shrinks occur when an in-sync broker goes down, as it decreases the number of in-sync replicas available for each partition replica on that broker. The expected value for this rate is normally zero. + * ISR Shrink Vs Expand Rate - If you see a Spike in ISR Shrink followed by ISR Expand Rate, this may be because of nodes that have fallen behind replication, and they may have either recovered or are in the process of recovering now. * Failed ISR Updates * Under Replicated Partitions Count - * Under Min ISR Partitions Count -The Under Min ISR Partitions metric displays the number of partitions, where the number of In-Sync Replicas (ISR) is less than the minimum number of in-sync replicas specified. The two most common causes of under-min ISR partitions are that one or more brokers are unresponsive, or the cluster is experiencing performance issues and one or more brokers are falling behind. + * Under Min ISR Partitions Count -The Under Min ISR Partitions metric displays the number of partitions, where the number of In-Sync Replicas (ISR) is less than the minimum number of in-sync replicas specified. 
The two most common causes of under-min ISR partitions are that one or more brokers are unresponsive, or the cluster is experiencing performance issues, and one or more brokers are falling behind. * The expected value for this rate is normally zero. Kafka dashboards @@ -299,12 +299,12 @@ Use this dashboard to monitor the following key metrics: ### Strimzi Kafka - Zookeeper -The **Strimzi Kafka -Zookeeper** dashboard provides an at-a-glance view of the state of your partitions, active controllers, leaders, throughput and network across Kafka brokers and clusters. +The **Strimzi Kafka -Zookeeper** dashboard provides an at-a-glance view of the state of your partitions, active controllers, leaders, throughput, and network across Kafka brokers and clusters. Use this dashboard to monitor key Zookeeper metrics such as: -* **Zookeeper disconnect rate** - This metric indicates if a Zookeeper node has lostits connection to a Kafka broker. +* **Zookeeper disconnect rate** - This metric indicates if a Zookeeper node has lost its connection to a Kafka broker. * **Authentication Failures** - This metric indicates a Kafka Broker is unable to connect to its Zookeeper node. -* **Session Expiration** - When a Kafka broker - Zookeeper node session expires, leader changes can occur and the broker can be assigned a new controller. If this metric is increasing we recommend you: +* **Session Expiration** - When a Kafka broker - Zookeeper node session expires, leader changes can occur, and the broker can be assigned a new controller. If this metric is increasing, we recommend you: 1. Check the health of your network. 2. Check for garbage collection issues and tune your JVMs accordingly. * Connection Rate. @@ -316,8 +316,8 @@ Use this dashboard to monitor key Zookeeper metrics such as: The Strimzi Kafka - Broker dashboard provides an at-a-glance view of the state of your partitions, active controllers, leaders, throughput, and network across Kafka brokers and clusters. 
Use this dashboard to: -* Monitor Under Replicaed and offline partitions to quickly identify if a Kafka broker is down or over utilized. -* Monitor Unclean Leader Election count metrics - this metric shows the number of failures to elect a suitable leader per second. Unclean leader elections are caused when there are no available in-sync replicas for a partition (either due to network issues, lag causing the broker to fall behind, or brokers going down completely), so an out of sync replica is the only option for the leader. When an out of sync replica is elected leader, all data not replicated from the previous leader is lost forever. +* Monitor Under Replicated and offline partitions to quickly identify if a Kafka broker is down or overutilized. +* Monitor Unclean Leader Election count metrics - this metric shows the number of failures to elect a suitable leader per second. Unclean leader elections are caused when there are no available in-sync replicas for a partition (either due to network issues, lag causing the broker to fall behind, or brokers going down completely), so an out-of-sync replica is the only option for the leader. When an out-of-sync replica is elected leader, all data not replicated from the previous leader is lost forever. * Monitor producer and fetch request rates. * Monitor Log flush rate to determine the rate at which log data is written to disk @@ -329,14 +329,14 @@ Use this dashboard to: The **Strimzi Kafka - Failures and Delayed Operations** dashboard gives you insight into all failures and delayed operations associated with your Kafka clusters. Use this dashboard to: -* Analyze failed produce requests - A failed produce request occurs when a problem is encountered when processing a produce request. This could be for a variety of reasons, however some common reasons are: - * The destination topic doesn’t exist (if auto-create is enabled then subsequent messages should be sent successfully). 
+* Analyze failed produce requests - A failed produce request occurs when a problem is encountered when processing a produce request. This could be for a variety of reasons, however, some common reasons are: + * The destination topic doesn’t exist (if auto-create is enabled, then subsequent messages should be sent successfully). * The message is too large. * The producer is using _request.required.acks=all_ or –_1_, and fewer than the required number of acknowledgements are received. -* Analyze failed Fetch Request - A failed fetch request occurs when a problem is encountered when processing a fetch request. This could be for a variety of reasons, but the most common cause is consumer requests timing out. +* Analyze failed Fetch Request - A failed fetch request occurs when a problem is encountered while processing a fetch request. This could be for a variety of reasons, but the most common cause is consumer requests timing out. * Monitor delayed Operations metrics - This contains metrics regarding the number of requests that are delayed and waiting in purgatory. The purgatory size metric can be used to determine the root cause of latency. For example, increased consumer fetch times could be explained by an increased number of fetch requests waiting in purgatory. Available metrics are: * Fetch Purgatory Size - The Fetch Purgatory Size metric shows the number of fetch requests currently waiting in purgatory. Fetch requests are added to purgatory if there is not enough data to fulfil the request (determined by fetch.min.bytes in the consumer configuration) and the requests wait in purgatory until the time specified by fetch.wait.max.ms is reached, or enough data becomes available. - * Produce Purgatory Size - The Produce Purgatory Size metric shows the number of produce requests currently waiting in purgatory. 
Produce requests are added to purgatory if request.required.acks is set to -1 or all, and the requests wait in purgatory until the partition leader receives an acknowledgement from all its followers. If the purgatory size metric keeps growing, some partition replicas may be overloaded. If this is the case, you can choose to increase the capacity of your cluster, or decrease the amount of produce requests being generated.
+  * Produce Purgatory Size - The Produce Purgatory Size metric shows the number of produce requests currently waiting in purgatory. Produce requests are added to purgatory if request.required.acks is set to -1 or all, and the requests wait in purgatory until the partition leader receives an acknowledgement from all its followers. If the purgatory size metric keeps growing, some partition replicas may be overloaded. If this is the case, you can choose to increase the capacity of your cluster or decrease the number of produce requests being generated.

Kafka dashboards

@@ -346,9 +346,9 @@ Use this dashboard to:

The **Strimzi Kafka - Request-Response Times** dashboard helps you get insight into key request and response latencies of your Kafka cluster.
Use this dashboard to:
-* Monitor request time metrics - The Request Metrics metric group contains information regarding different types of request to and from the cluster. Important request metrics to monitor:
+* Monitor request time metrics - The Request Metrics metric group contains information regarding different types of requests to and from the cluster. Important request metrics to monitor:
 1. **Fetch Consumer Request Total Time** - The Fetch Consumer Request Total Time metric shows the maximum and mean amount of time taken for processing, and the number of requests from consumers to get new data. 
Reasons for increased time taken could be: increased load on the node (creating processing delays), or perhaps requests are being held in purgatory for a long time (determined by fetch.min.bytes and fetch.wait.max.ms metrics).
-  2. **Fetch Follower Request Total Time** - The Fetch Follower Request Total Time metric displays the maximum and mean amount of time taken while processing, and the number of requests to get new data from Kafka brokers that are followers of a partition. Common causes of increased time taken are increased load on the node causing delays in processing requests, or that some partition replicas may be overloaded or temporarily unavailable.
+  2. **Fetch Follower Request Total Time** - The Fetch Follower Request Total Time metric displays the maximum and mean amount of time taken while processing, and the number of requests to get new data from Kafka brokers that are followers of a partition. Common causes of increased time taken are increased load on the node, causing delays in processing requests, or that some partition replicas may be overloaded or temporarily unavailable.
 3. **Produce Request Total Time** - The Produce Request Total Time metric displays the maximum and mean amount of time taken for processing, and the number of requests from producers to send data. Some reasons for increased time taken could be: increased load on the node causing delays in processing the requests, or perhaps requests are being held in purgatory for a long time (if the `request.required.acks` metric is equal to '1' or all).

Kafka dashboards

@@ -358,8 +358,8 @@ Use this dashboard to:

This dashboard helps you quickly analyze your Kafka error logs across all clusters.
Use this dashboard to:
-* Identify critical events in your Kafka broker and controller logs;
-* Examine trends to detect spikes in Error or Fatal events
+* Identify critical events in your Kafka broker and controller logs.
+* Examine trends to detect spikes in Error or Fatal events.
* Monitor Broker added/started and shutdown events in your cluster.
* Quickly determine patterns across all logs in a given Kafka cluster.

@@ -367,13 +367,13 @@ Use this dashboard to:

### Kafka Broker - Performance Overview

-The **Kafka Broker - Performance Overview** dashboards helps you Get an at-a-glance view of the performance and resource utilization of your Kafka brokers and their JVMs.
+The **Kafka Broker - Performance Overview** dashboard provides an at-a-glance view of the performance and resource utilization of your Kafka brokers and their JVMs.
Use this dashboard to:
* Monitor the number of open file descriptors. If the number of open file descriptors reaches the maximum file descriptor limit, it can cause an IOException error.
* Get insight into Garbage collection and its impact on CPU usage and memory.
* Examine how threads are distributed.
-* Understand the behavior of class count. If class count keeps on increasing, you may have a problem with the same classes loaded by multiple classloaders.
+* Understand the behavior of the class count. If the class count keeps increasing, you may have a problem with the same classes loaded by multiple classloaders.

Kafka dashboards

@@ -393,7 +393,7 @@ The **Kafka Broker - Memory** dashboard shows the percentage of the heap and non
Use this dashboard to:
* Understand how memory is used across Heap and Non-Heap memory.
* Examine physical and swap memory usage and make resource adjustments as needed.
-* Examine the pending object finalization count which when high can lead to excessive memory usage.
+* Examine the pending object finalization count, which, when high, can lead to excessive memory usage.

Kafka dashboards

@@ -407,13 +407,13 @@ Use this dashboard to:
 1. Topic replication factor of Kafka topics.
 2. Log retention settings.
* Analyze trends in disk throughput and find any spikes. This is especially important as disk throughput can be a performance bottleneck.
-* Monitor iNodes bytes used, and disk read vs writes. 
These metrics are important to monitor as Kafka may not necessarily distribute data from a heavily occupied disk, which itself can bring the Kafka down.
+* Monitor iNodes bytes used, and disk read vs writes. These metrics are important to monitor as Kafka may not necessarily distribute data from a heavily occupied disk, which itself can bring Kafka down.

Kafka dashboards

### Kafka Broker - Garbage Collection

-The **Kafka Broker - Garbage Collection** dashboard shows key Garbage Collector statistics like the duration of the last GC run, objects collected, threads used, and memory cleared in the last GC run of your java virtual machine.
+The **Kafka Broker - Garbage Collection** dashboard shows key Garbage Collector statistics like the duration of the last GC run, objects collected, threads used, and memory cleared in the last GC run of your Java virtual machine.

Use this dashboard to:
* Understand the amount of time spent in garbage collection. If this time keeps increasing, your Kafka brokers may have more CPU usage.
@@ -438,24 +438,24 @@ The **Kafka Broker - Class Loading and Compilation** dashboard helps you get ins

Use this dashboard to:

-* Determine If the class count keeps increasing, this indicates that the same classes are loaded by multiple classloaders.
-* Get insights into time spent by Java Virtual machines during compilation.
+* Determine whether the class count keeps increasing; if it does, the same classes may be loaded by multiple classloaders.
+* Get insights into the time spent by Java Virtual Machines during compilation.

Kafka dashboards

### Strimzi Kafka - Topic Overview

-The Strimzi Kafka - Topic Overview dashboard helps you quickly identify under-replicated partitions, and incoming bytes by Kafka topic, server and cluster.
+The Strimzi Kafka - Topic Overview dashboard helps you quickly identify under-replicated partitions and incoming bytes by Kafka topic, server, and cluster. 
Use this dashboard to: -* Monitor under replicated partitions - The Under Replicated Partitions metric displays the number of partitions that do not have enough replicas to meet the desired replication factor. A partition will also be considered under-replicated if the correct number of replicas exist, but one or more of the replicas have fallen significantly behind the partition leader. The two most common causes of under-replicated partitions are that one or more brokers are unresponsive, or the cluster is experiencing performance issues and one or more brokers have fallen behind. +* Monitor Under Replicated partitions - The Under Replicated Partitions metric displays the number of partitions that do not have enough replicas to meet the desired replication factor. A partition will also be considered Under Replicated if the correct number of replicas exists, but one or more of the replicas have fallen significantly behind the partition leader. The two most common causes of under-replicated partitions are that one or more brokers are unresponsive, or the cluster is experiencing performance issues, and one or more brokers have fallen behind. This metric is tagged with cluster, server, and topic info for easy troubleshooting. The colors in the Honeycomb chart are coded as follows: -1. Green indicates there are no under Replicated Partitions. -2. Red indicates a given partition is under replicated. +1. Green indicates there are no under-replicated partitions. +2. Red indicates that a given partition is under-replicated. Kafka dashboards @@ -463,11 +463,11 @@ Use this dashboard to: ### Strimzi Kafka - Topic Details -The Strimzi Kafka - Topic Details dashboard gives you insight into throughput, partition sizes and offsets across Kafka brokers, topics and clusters. +The Strimzi Kafka - Topic Details dashboard gives you insight into throughput, partition sizes, and offsets across Kafka brokers, topics, and clusters. 
Use this dashboard to: * Monitor metrics like Log partition size, log start offset, and log segment count metrics. -* Identify offline/under replicated partitions count. Partitions can be in this state on account of resource shortages or broker unavailability. +* Identify offline/under-replicated partitions count. Partitions can be in this state on account of resource shortages or broker unavailability. * Monitor the In Sync replica (ISR) Shrink rate. ISR shrinks occur when an in-sync broker goes down, as it decreases the number of in-sync replicas available for each partition replica on that broker. * Monitor In Sync replica (ISR) Expand rate. ISR expansions occur when a broker comes online, such as when recovering from a failure or adding a new node. This increases the number of in-sync replicas available for each partition on that broker. @@ -485,13 +485,13 @@ import CreateMonitors from '../../reuse/apps/create-monitors.md'; |:---------------------------------------------|:---------------------------------------------------------------------------------------------------------------------------------------------------------|:-----------------|:-------------------| | Strimzi Kafka - High Broker Disk Utilization | This alert fires when we detect that a disk on a broker node is more than 85% full. | `>=`85 | < 85 | | Strimzi Kafka - Failed Zookeeper connections | This alert fires when we detect Broker to Zookeeper connection failures. | | | -| Strimzi Kafka - High Leader election rate | This alert fires when we detect high leader election rate. | | | -| Strimzi Kafka - Garbage collection | This alert fires when we detect that the average Garbage Collection time on a given Kafka broker node over a 5 minute interval is more than one second. | > = 1 | < 1 | +| Strimzi Kafka - High Leader election rate | This alert fires when we detect a high leader election rate. 
| | | +| Strimzi Kafka - Garbage collection | This alert fires when we detect that the average Garbage Collection time on a given Kafka broker node over a 5-minute interval is more than one second. | > = 1 | < 1 | | Strimzi Kafka - Offline Partitions | This alert fires when we detect offline partitions on a given Kafka broker. | | | | Strimzi Kafka - Fatal Event on Broker | This alert fires when we detect a fatal operation on a Kafka broker node | `>=`1 | `<`1 | | Strimzi Kafka - Underreplicated Partitions | This alert fires when we detect underreplicated partitions on a given Kafka broker. | | | | Strimzi Kafka - Large number of broker errors | This alert fires when we detect that there are 5 or more errors on a Broker node within a time interval of 5 minutes. | | | -| Strimzi Kafka - High CPU on Broker node | This alert fires when we detect that the average CPU utilization for a broker node is high (`>=`85%) for an interval of 5 minutes. | | | +| Strimzi Kafka - High CPU on Broker node | This alert fires when we detect that the average CPU utilization for a broker node is high (`>=`85%) for 5 minutes. | | | | Strimzi Kafka - Out of Sync Followers | This alert fires when we detect that there are Out of Sync Followers within a time interval of 5 minutes. | | | -| Strimzi Kafka - High Broker Memory Utilization | This alert fires when the average memory utilization within a 5 minute interval for a given Kafka node is high (`>=`85%). | `>=` 85 | < 85 | +| Strimzi Kafka - High Broker Memory Utilization | This alert fires when the average memory utilization within a 5-minute interval for a given Kafka node is high (`>=`85%). 
| `>=` 85 | < 85 | From e19ce9e0304b6faa5bb59a17af3f1cbcdd322f8b Mon Sep 17 00:00:00 2001 From: Amee Lepcha Date: Fri, 9 May 2025 10:24:24 +0530 Subject: [PATCH 4/8] Update iis-7.md --- docs/integrations/microsoft-azure/iis-7.md | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/docs/integrations/microsoft-azure/iis-7.md b/docs/integrations/microsoft-azure/iis-7.md index a139f595b0..2eeea8543b 100644 --- a/docs/integrations/microsoft-azure/iis-7.md +++ b/docs/integrations/microsoft-azure/iis-7.md @@ -9,7 +9,7 @@ import useBaseUrl from '@docusaurus/useBaseUrl'; thumbnail icon -The IIS 7 App monitors the performance and reliability of your Microsoft Internet Information Services (IIS) infrastructure, identifying customer-facing and internal operational issues. Additionally, you can monitor customer paths and interactions to learn how customers are using your product. The app consists of predefined searches and Dashboards, which provide visibility into your environment for real time or historical analysis. +The IIS 7 App monitors the performance and reliability of your Microsoft Internet Information Services (IIS) infrastructure, identifying customer-facing and internal operational issues. Additionally, you can monitor customer paths and interactions to learn how customers are using your product. The app consists of predefined searches and Dashboards, which provide visibility into your environment for real-time or historical analysis. ## Log types @@ -54,7 +54,7 @@ For more information about the IIS 7 log (IIS 7.5 logs are used) format, see [ht The following query samples are taken from the IIS 7 App. -The following query is taken from the the **Requests by App Over Time** panel on the **IIS 7 Traffic Insights - App Requests Dashboard**. +The following query is taken from the **Requests by App Over Time** panel on the **IIS 7 Traffic Insights - App Requests Dashboard**. 
```sql title="Requests by App Over Time" _sourceCategory=IIS* @@ -104,7 +104,7 @@ This procedure explains how to enable logging from Microsoft Internet Informatio To prepare for logging IIS 7 events, perform the following two tasks. To enable logging on your IIS Server, do the following: -1. Open the Sever Manager Console +1. Open the Server Manager Console 2. Select **Roles** 3. Select **Web Server (IIS)** 4. Select the host from which to collect IIS logs @@ -152,8 +152,8 @@ To collect logs from IIS 7, use an Installed Collector and a Local File Source. 1. **Name**: Required (for example, "IIS") 2. **Description**. (Optional) 3. **File Path** (Required).`C:\inetpub\Logs\LogFiles\W3SVC1\*.log` - 4. **Collection start time**. Choose how far back you would like to begin collecting historical logs. For example, choose 7 days ago to being collecting logs with a last modified date within the last seven days. - 5. **Source Host**. Sumo Logic uses the hostname assigned by the operating system by default, but you can enter a different host name. + 4. **Collection start time**. Choose how far back you would like to begin collecting historical logs. For example, choose 7 days ago to begin collecting logs with a last modified date within the last seven days. + 5. **Source Host**. Sumo Logic uses the hostname assigned by the operating system by default, but you can enter a different hostname. 6. **Source Category** (Required). For example, "IIS_prod". (The Source Category metadata field is a fundamental building block to organize and label Sources. For details, see [Best Practices](/docs/send-data/best-practices).) 3. Configure the **Advanced** section: 7. **Timestamp Parsing Settings**: Make sure the setting matches the timezone on the log files. @@ -161,7 +161,7 @@ To collect logs from IIS 7, use an Installed Collector and a Local File Source. 9. **Time Zone**: Select the option to **Use time zone from log file. If none is present use:** and set the timezone to **UTC**. 10. 
**Timestamp Format**: Select the option to **Automatically detect the format**.
 11. **Encoding**. UTF-8 is the default, but you can choose another encoding format from the menu if your IIS logs are encoded differently.
-  12. **Enable Multiline Processing**. Disable the option to Detect messages spanning multiple lines. Because IIS logs are single line log files, disabling this option will improve performance of the collection and ensure that your messages are submitted correctly to Sumo Logic.
+  12. **Enable Multiline Processing**. Disable the option to detect messages spanning multiple lines. Because IIS logs are single-line log files, disabling this option will improve the performance of the collection and ensure that your messages are submitted correctly to Sumo Logic.
4. Click **Save**.

After a few minutes, your new Source should be propagated down to the Collector and will begin submitting your IIS log files to the Sumo Logic service.

@@ -230,6 +230,6 @@ The IIS 7 - Traffic Insights - Content and Client Platform Dashboard provides in

### Visitor Insights

-The **IIS 7 - Visitor Insights Dashboard** provides information on the geographic locations and number of users by client IP address, the number of visitors per country, locations and number of users by client IP address by US state, and the number of visitors per US state.
+The **IIS 7 - Visitor Insights Dashboard** provides information on the geographic locations and number of users by client IP address, the number of visitors per country, the locations and number of users by client IP address by US state, and the number of visitors per US state. 
Visitor Insights From bc7964c92048e1950345df823a836c7242b9306b Mon Sep 17 00:00:00 2001 From: Amee Lepcha Date: Fri, 9 May 2025 10:45:51 +0530 Subject: [PATCH 5/8] Update squid-proxy.md --- docs/integrations/web-servers/squid-proxy.md | 103 ++++++++----------- 1 file changed, 45 insertions(+), 58 deletions(-) diff --git a/docs/integrations/web-servers/squid-proxy.md b/docs/integrations/web-servers/squid-proxy.md index 4ac4913014..fc892c2f8c 100644 --- a/docs/integrations/web-servers/squid-proxy.md +++ b/docs/integrations/web-servers/squid-proxy.md @@ -11,19 +11,17 @@ import TabItem from '@theme/TabItem'; Thumbnail icon -The Squid Proxy app is a unified logs and metrics app that helps you monitor activity in Squid Proxy. The preconfigured dashboards provide insight into served and denied requests; performance metrics; IP domain DNS statistics; traffic details; HTTP response codes; URLs experiencing redirects, client errors, and server errors; and quality of service data that helps you understand your users’ experience. +The Squid Proxy app is a unified logs and metrics app that helps you monitor activity in Squid Proxy. The preconfigured dashboards provide insight into served and denied requests, performance metrics, IP domain DNS statistics, traffic details, HTTP response codes, URLs experiencing redirects, client errors, server errors, and quality of service data that helps you understand your users’ experience. This app is tested with the following Squid Proxy versions: * For Kubernetes environments: Squid Proxy version: 6.0.0 * Non-Kubernetes environments: Squid Proxy version: 6.0.0 - ## Collecting logs and metrics for the Squid Proxy app This section provides instructions for configuring log and metric collection for the Sumo Logic app for Squid Proxy. 
- -### Configure Logs and Metrics Collection for Squid Proxy +### Configure logs and metrics collection for Squid Proxy Sumo Logic supports the collection of logs and metrics data from Squid Proxy in both Kubernetes and non-Kubernetes environments. @@ -43,24 +41,24 @@ In Kubernetes environments, we use the Telegraf Operator, which is packaged with Squid Proxy -The first service in the pipeline is Telegraf. Telegraf collects metrics from Squid Proxy. Note that we’re running Telegraf in each pod we want to collect metrics from as a sidecar deployment: i.e. Telegraf runs in the same pod as the containers it monitors. Telegraf uses the [SNMP input plugin](https://github.com/influxdata/telegraf/tree/master/plugins/inputs/snmp) to obtain metrics. For simplicity, the diagram doesn’t show the input plugins. +The first service in the pipeline is Telegraf. Telegraf collects metrics from Squid Proxy. Note that we’re running Telegraf in each pod we want to collect metrics from as a sidecar deployment: i.e., Telegraf runs in the same pod as the containers it monitors. Telegraf uses the [SNMP input plugin](https://github.com/influxdata/telegraf/tree/master/plugins/inputs/snmp) to obtain metrics. For simplicity, the diagram doesn’t show the input plugins. The injection of the Telegraf sidecar container is done by the Telegraf Operator. Prometheus pulls metrics from Telegraf and sends them to [Sumo Logic Distribution for OpenTelemetry Collector](https://github.com/SumoLogic/sumologic-otel-collector), which enriches metadata and sends metrics to Sumo Logic. In the logs pipeline, Sumo Logic Distribution for OpenTelemetry Collector collects logs written to standard out and forwards them to another instance of Sumo Logic Distribution for OpenTelemetry Collector, which enriches metadata and sends logs to Sumo Logic. :::note Prerequisites -It’s assumed that you are using the latest helm chart version. If not, upgrade using the instructions [here](/docs/send-data/kubernetes). 
+It’s assumed that you are using the latest Helm chart version. If not, upgrade using the instructions [here](/docs/send-data/kubernetes).
:::

### Configure metrics collection

This section explains the steps to collect Squid Proxy metrics from a Kubernetes environment. In Kubernetes environments, we use the Telegraf Operator, which is packaged with our Kubernetes collection. You can learn more about this [here](/docs/send-data/collect-from-other-data-sources/collect-metrics-telegraf/telegraf-collection-architecture). Follow the steps listed below to collect metrics from a Kubernetes environment:

1. [Set up Kubernetes Collection with the Telegraf Operator](/docs/send-data/collect-from-other-data-sources/collect-metrics-telegraf/install-telegraf)
2. Enable the SNMP agent on Squid Proxy. By default, the [SNMP agent](https://wiki.squid-cache.org/Features/Snmp) will be disabled on the Squid Proxy. To enable the SNMP agent on Squid, edit the configuration file of the Squid Proxy (squid.conf) and add the following section in the ConfigMap that is mounted to Squid Proxy pods:
```bash
acl snmppublic snmp_community public
snmp_port 3401
snmp_access allow snmppublic localhost
```
@@ -245,16 +243,16 @@ annotations:

If you haven’t defined a farm in Squid Proxy, then enter ‘**default**’ for `proxy_cluster`.

Enter values for the following parameters (marked `CHANGEME` in the snippet above):
-* `telegraf.influxdata.com/inputs`. This contains the required configuration for the Telegraf SNMP Input plugin. 
Please refer[ to this doc](https://github.com/influxdata/telegraf/tree/master/plugins/inputs/redis) for more information on configuring the SNMP input plugin for Telegraf. Note: As telegraf will be run as a sidecar the host should always be localhost.
+* `telegraf.influxdata.com/inputs`. This contains the required configuration for the Telegraf SNMP Input plugin. Please refer to this [doc](https://github.com/influxdata/telegraf/tree/master/plugins/inputs/snmp) for more information on configuring the SNMP input plugin for Telegraf. Note: As Telegraf will be run as a sidecar, the host should always be localhost.
* In the tags section, which is `[inputs.snmp.tags]`
  * `environment`. This is the deployment environment where the Squid Proxy cluster identified by the value of **servers** resides. For example: dev, prod, or qa. While this value is optional, we highly recommend setting it.
  * `proxy_cluster`. Enter a name to identify this Squid Proxy cluster. This farm name will be shown in the Sumo Logic dashboards.

**Do not modify** the following values set by this configuration, as it will cause the Sumo Logic app to not function correctly.

* `telegraf.influxdata.com/class: sumologic-prometheus`. This instructs the Telegraf operator what output to use. This should not be changed.
* `prometheus.io/scrape: "true"`. This ensures our Prometheus will scrape the metrics.
-* `prometheus.io/port: "9273"`. This tells prometheus what ports to scrape on. This should not be changed.
+* `prometheus.io/port: "9273"`. This tells Prometheus what ports to scrape on. This should not be changed. 
* `telegraf.influxdata.com/inputs` * In the tags section, which is `[inputs.snmp.tags]` * `component: “proxy”`. This value is used by Sumo Logic apps to identify application components. @@ -265,12 +263,12 @@ Enter in values for the following parameters (marked `CHANGEME` in the snippet a 4. Sumo Logic Kubernetes collection will automatically start collecting metrics from the pods having the labels and annotations defined in the previous step. 5. Verify metrics in Sumo Logic. -### Configure Logs Collection +### Configure logs collection This section explains the steps to collect Squid Proxy logs from a Kubernetes environment. 1. **(Recommended Method) Add labels on your Squid Proxy pods to capture logs from standard output**. Make sure that the logs from Squid Proxy are sent to stdout. Follow the instructions below to capture Squid Proxy logs from stdout on Kubernetes. - 1. Apply following labels to the Squid Proxy pod: + 1. Apply the following labels to the Squid Proxy pod: ```sql environment="prod_CHANGEME" component="proxy" @@ -278,10 +276,10 @@ This section explains the steps to collect Squid Proxy logs from a Kubernetes en proxy_cluster="" ``` Enter in values for the following parameters (marked **CHANGE_ME** above): - * `environment`. This is the deployment environment where the Squid Proxy cluster identified by the value of `servers` resides. For example:- dev, prod, or QA. While this value is optional we highly recommend setting it. + * `environment`. This is the deployment environment where the Squid Proxy cluster identified by the value of `servers` resides. For example:- dev, prod, or QA. While this value is optional, we highly recommend setting it. * `proxy_cluster`. Enter a name to identify this Squid Proxy cluster. This farm name will be shown in the Sumo Logic dashboards. If you haven’t defined a cluster in Squid Proxy, then enter `default` for `proxy_cluster`. 
- **Do not modify** the following values set by this configuration as it will cause the Sumo Logic app to not function correctly. + **Do not modify** the following values set by this configuration, as it will cause the Sumo Logic app to not function correctly. * `component: “proxy”`. This value is used by Sumo Logic apps to identify application components. * `proxy_system: “squidproxy”` - This value identifies the proxy system. @@ -291,7 +289,7 @@ For all other parameters, see [this doc](/docs/send-data/collect-from-other-data 3. Verify logs in Sumo Logic. 2. **(Optional) Collecting Squid Proxy Logs from a Log File** Follow the steps below to capture Squid Proxy logs from a log file on Kubernetes. - 1. Determine the location of the Squid Proxy log file on Kubernetes. This can be determined from the squid.conf for your Squid Proxy cluster along with the mounts on the Squid Proxy pods. + 1. Determine the location of the Squid Proxy log file on Kubernetes. This can be determined from the squid.conf for your Squid Proxy cluster, along with the mounts on the Squid Proxy pods. 2. Install the Sumo Logic [tailing sidecar operator](https://github.com/SumoLogic/tailing-sidecar/tree/main/operator#deploy-tailing-sidecar-operator). 3. Add the following annotation in addition to the existing annotations. ```xml @@ -311,10 +309,9 @@ For all other parameters, see [this doc](/docs/send-data/collect-from-other-data 6. Verify logs in Sumo Logic.
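The tailing-sidecar annotation above intentionally leaves the mount and log path for you to fill in. As a purely illustrative sketch (the volume name `data` is an assumption taken from the Kafka example earlier in this patch, and the path is Squid's default Linux access log location — substitute the mount and path from your own pod spec and squid.conf), a filled-in annotation could look like:

```yaml
# Illustrative only: replace "data" and the log path with your own volume mount and squid.conf log location.
annotations:
  tailing-sidecar: sidecarconfig;data:/var/log/squid/access.log
```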
-**FER to normalize the fields in Kubernetes environments**. Labels created in Kubernetes environments automatically are prefixed with `pod_labels`. To normalize these for our app to work, a Field Extraction Rule named **AppObservabilitySquidProxyFER** is automatically created for Squid Proxy Application Components. +**FER to normalize the fields in Kubernetes environments**. Labels created in Kubernetes environments are automatically prefixed with `pod_labels`. To normalize these for our app to work, a Field Extraction Rule named **AppObservabilitySquidProxyFER** is automatically created for Squid Proxy Application Components.
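To make the effect of this rule concrete, the sketch below mimics the renaming the FER performs. This is illustrative only — the function is hypothetical and is not part of any Sumo Logic API; the real rule runs inside Sumo Logic and is created for you automatically:

```python
def normalize_pod_labels(fields):
    """Mimic the FER: strip the 'pod_labels_' prefix that Kubernetes metadata adds."""
    prefix = "pod_labels_"
    return {(k[len(prefix):] if k.startswith(prefix) else k): v
            for k, v in fields.items()}

raw = {"pod_labels_component": "proxy", "pod_labels_proxy_system": "squidproxy"}
print(normalize_pod_labels(raw))  # {'component': 'proxy', 'proxy_system': 'squidproxy'}
```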
- @@ -324,12 +321,12 @@ Sumo Logic uses the Telegraf operator for Squid Proxy metric collection and the The process to set up collection for Squid Proxy data is done through the following steps. -### Configure Logs Collection +### Configure logs collection Squid Proxy app supports the default access logs and cache logs format. -1. **Configure logging in Squid Proxy.** By default, the squid proxy will write the access log to the log directory that was configured during installation. For example, on Linux, the log directory would be `/var/log/squid/access.log`. If the access log is disabled then you must enable the access log following [these instructions](https://wiki.squid-cache.org/SquidFaq/SquidLogs). -2. **Configure an Installed Collector.** If you have not already done so, install and configure an installed collector for Windows by [following the documentation](/docs/send-data/installed-collectors/windows). +1. **Configure logging in Squid Proxy.** By default, the squid proxy will write the access log to the log directory that was configured during installation. For example, on Linux, the log directory would be `/var/log/squid/access.log`. If the access log is disabled, then you must enable the access log following [these instructions](https://wiki.squid-cache.org/SquidFaq/SquidLogs). +2. **Configure an Installed Collector.** If you have not already done so, install and configure an installed collector for Windows by referring to this [documentation](/docs/send-data/installed-collectors/windows). 3. **Configure a Collector**. Use one of the following Sumo Logic Collector options: 1. To collect logs directly from the Squid Proxy machine, configure an [Installed Collector](/docs/send-data/installed-collectors). 2. If you're using a service like Fluentd, or you would like to upload your logs manually, [Create a Hosted Collector](/docs/send-data/hosted-collectors/configure-hosted-collector.md). 
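Because the app expects Squid's default (native) access log format, it can help to sanity-check what one line looks like before wiring up collection. The sketch below is illustrative only — the field names are informal labels for the space-separated columns, and the sample line is fabricated:

```python
# Informal labels for the space-separated fields of Squid's native access.log.
FIELDS = [
    "timestamp", "elapsed_ms", "client_ip", "result_code", "bytes",
    "method", "url", "user", "hierarchy", "content_type",
]

def parse_access_line(line):
    """Split one native-format access.log line into a dict of labeled fields."""
    return dict(zip(FIELDS, line.split(None, len(FIELDS) - 1)))

sample = ("1588775432.123    120 10.0.0.5 TCP_MISS/200 1024 "
          "GET http://example.com/index.html - HIER_DIRECT/93.184.216.34 text/html")
print(parse_access_line(sample)["result_code"])  # TCP_MISS/200
```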
@@ -361,7 +358,7 @@ environment = #For example, Dev, QA, or Prod * Encoding. Select UTF-8 (Default). * Enable Multiline Processing. * Error logs. Select Detect messages spanning multiple lines and Infer Boundaries - Detect message boundaries automatically. - * Access logs. These are single-line logs, uncheck Detect messages spanning multiple lines. + * Access logs. These are single-line logs; uncheck Detect messages spanning multiple lines. 4. Click Save. @@ -389,8 +386,7 @@ If you're using a service like Fluentd, or you would like to upload your logs ma - -### Configure Metrics Collection +### Configure metrics collection 1. **Set up a Sumo Logic HTTP Source**. 1. Configure a Hosted Collector for Metrics. To create a new Sumo Logic hosted collector, perform the steps in the [Create a Hosted Collector](/docs/send-data/hosted-collectors/configure-hosted-collector.md) documentation. @@ -402,15 +398,15 @@ If you're using a service like Fluentd, or you would like to upload your logs ma * **Source Category** (Recommended). Be sure to follow the [Best Practices for Source Categories](/docs/send-data/best-practices). A recommended Source Category may be Prod/ProxyServer/SquidProxy/Metrics. 3. Select **Save**. 4. Take note of the URL provided once you click _Save_. You can retrieve it again by selecting the **Show URL** next to the source on the Collection Management screen. -2. **Enable SNMP agent on Squid Proxy**. By default, the [SNMP agent](https://wiki.squid-cache.org/Features/Snmp) will be disabled on squid proxy. You have to enable it. To enable the SNMP agent on squid, edit the configuration file of the squid proxy (squid.conf) and add the following section: +2. **Enable SNMP agent on Squid Proxy**. By default, the [SNMP agent](https://wiki.squid-cache.org/Features/Snmp) will be disabled on the Squid Proxy. You have to enable it. 
To enable the SNMP agent on squid, edit the configuration file of the squid proxy (squid.conf) and add the following section: ```bash acl snmppublic snmp_community public snmp_port 3401 snmp_access allow snmppublic localhost ``` 3. **Set up Telegraf**. - 1. Install Telegraf if you haven’t already, using the [following steps](/docs/send-data/collect-from-other-data-sources/collect-metrics-telegraf/install-telegraf.md). - 2. Configure and start Telegraf: as part of collecting metrics data from Telegraf, we'll use the [SNMP input plugin](https://github.com/influxdata/telegraf/tree/master/plugins/inputs/snmp) to get data from Telegraf and the [Sumo Logic output plugin](https://github.com/SumoLogic/fluentd-output-sumologic) to send data to Sumo Logic. + 1. Install Telegraf if you haven’t already, using the steps given in this [documentation](/docs/send-data/collect-from-other-data-sources/collect-metrics-telegraf/install-telegraf.md). + 2. Configure and start Telegraf. As part of collecting metrics data from Telegraf, we'll use the [SNMP input plugin](https://github.com/influxdata/telegraf/tree/master/plugins/inputs/snmp) to get data from Telegraf and the [Sumo Logic output plugin](https://github.com/SumoLogic/fluentd-output-sumologic) to send data to Sumo Logic.
Click to expand. Create or modify `telegraf.conf` and copy and paste the text below: @@ -586,11 +582,11 @@ If you're using a service like Fluentd, or you would like to upload your logs ma * Enter values for fields annotated with `` to the appropriate values. Do not include the brackets (`< >`) in your final configuration. * In the tags section, which is `[inputs.snmp.tags]`: - * `environment`. This is the deployment environment where the Squid Proxy server identified by the value of servers resides. For example; dev, prod, or QA. While this value is optional we highly recommend setting it. + * `environment`. This is the deployment environment where the Squid Proxy server identified by the value of servers resides. For example, dev, prod, or QA. While this value is optional, we highly recommend setting it. * `proxy_cluster`. Enter a name to identify this Squid Proxy cluster. This cluster name will be shown in our dashboards. * In the output plugins section, which is `[[outputs.sumologic]]`: * `URL` - This is the HTTP source URL created previously. See this doc for more information on additional parameters for configuring the Sumo Logic Telegraf output plugin. - * **Do not modify** the following values set by this configuration as it will cause the Sumo Logic app to not function correctly. If you haven’t defined a cluster in Squid Proxy, then enter `default` for `proxy_cluster`. + * **Do not modify** the following values set by this configuration, as it will cause the Sumo Logic app to not function correctly. If you haven’t defined a cluster in Squid Proxy, then enter `default` for `proxy_cluster`. * `data_format: “prometheus”`. In the output `[[outputs.sumologic]]` plugins section. Metrics are sent in the Prometheus format to Sumo Logic. * `component - “proxy”`. In the input `[[inputs.snmp]]` plugins section. This value is used by Sumo Logic apps to identify application components. * `proxy_system - “squidproxy”`. In the input plugins sections. 
This value identifies the proxy system. @@ -618,12 +614,9 @@ If you're using a service like Fluentd, or you would like to upload your logs ma ``` 5. Click **Save** to create the rule. - - - ## Installing the Squid Proxy app import AppInstall2 from '../../reuse/apps/app-install-sc-k8s.md'; @@ -643,7 +636,7 @@ Additionally, if you're using Squid Proxy in the Kubernetes environment, the fol * `pod_labels_proxy_system` * `pod_labels_proxy_cluster` -## Viewing the Squid Proxy Dashboards +## Viewing the Squid Proxy dashboards import ViewDashboards from '../../reuse/apps/view-dashboards.md'; @@ -651,74 +644,69 @@ import ViewDashboards from '../../reuse/apps/view-dashboards.md'; ### Overview -The **Squid Proxy (Classic) - Overview** dashboard provides an at-a-glance view of the activity and health of the SquidProxy clusters and servers by monitoring uptime, number of current clients, latency, bandwidth, destination locations, error and denied requests, URLs accessed. +The **Squid Proxy (Classic) - Overview** dashboard provides an at-a-glance view of the activity and health of the Squid Proxy clusters and servers by monitoring uptime, number of current clients, latency, bandwidth, destination locations, error and denied requests, and URLs accessed. Use this dashboard to: * Gain insights into information about the destination location your intranet frequently visits by region. -* Gain insights into your Squid Proxy health using Latency, HTTP Errors, Status codes of Squid Proxy Servers. -* Get insights into information about Uptime and bandwidth of Squid Proxy servers. +* Gain insights into your Squid Proxy health using Latency, HTTP Errors, and Status codes of Squid Proxy Servers. +* Get insights into information about the Uptime and bandwidth of Squid Proxy servers. * Get insights into information about the web browsing behavior of users using Top accessed URLs, denied URLs, 4xx errors URLs, 5xx errors URLs, and top remote hosts. 
Squid Proxy - ### Protocol -The **Squid Proxy (Classic) - Protocol** dashboard provides an insight into the protocols of clusters: the number of HTTP requests, HTTP errors, total bytes transferred, the number of HTTP requests per second, the number of HTTP's bytes per second. +The **Squid Proxy (Classic) - Protocol** dashboard provides insight into the protocols of clusters: the number of HTTP requests, HTTP errors, total bytes transferred, the number of HTTP requests per second, and the number of HTTP bytes per second. Use this dashboard to: * Get detailed information about the total number of requests from clients, the total number of HTTP errors sent to clients, the total number of bytes transferred on servers, total number of bytes sent to clients -* Get insights into information about HTTP requests, HTTP errors, bandwidth transferred over time. +* Get insights into information about HTTP requests, HTTP errors, and bandwidth transferred over time. Squid Proxy - ### Performance -The **Squid Proxy (Classic) - Performance** dashboard provides an insight into the workload of clusters, the number of page faults IO, percent of file descriptor used, number of memory used, the time for all HTTP requests, the number of objects in the cache, the CPU time. +The **Squid Proxy (Classic) - Performance** dashboard provides insight into the workload of clusters, the number of page faults, the percent of file descriptors used, the amount of memory used, the time for all HTTP requests, the number of objects in the cache, and the CPU time. Use this dashboard to: -* Gain insights into the workload of squid proxy servers such as percent of file descriptors used, memory usage, CPU time consumed. -* Gain insights into the read and write status of squid proxy servers such as Page Faults IO, HTTP I/O number of reading, the number of objects stored, the average of time response.
+* Gain insights into the workload of squid proxy servers, such as the percentage of file descriptors used, memory usage, and CPU time consumed. +* Gain insights into the read and write status of Squid Proxy servers, such as Page Faults IO, HTTP I/O number of reading, the number of objects stored, and the average response time. Squid Proxy - ### IP Domain DNS Statistics -The **Squid Proxy (Classic) - IP Domain DNS Statistics** dashboard provides a high-level view of the number of IPs, the number of FQDN, rate requests cache according to FQDN, rate requests cache according to IPs, the number of DNS queries, time for DNS query. +The **Squid Proxy (Classic) - IP Domain DNS Statistics** dashboard provides a high-level view of the number of IPs, the number of FQDNs, the rate of requests cached according to FQDNs, the rate of requests cached according to IPs, the number of DNS queries, and the time for DNS queries. Use this dashboard to: -* Gain insights into IPs accessed statistics: IP Cache Entries, Number and rate of IP Cache requests, Number and rate of IP Cache hits. +* Gain insights into IP access statistics: IP Cache Entries, Number and rate of IP Cache requests, Number and rate of IP Cache hits. * Gain insights into Domain Name (FQDN) statistics: FQDN Cache Entries, Number of FQDN Cache misses, Number and rate of FQDN Cache requests, Number of FQDN Cache Negative Hits. -* Gain insights into DNS Lookup statistics: Number of External DNS Server Requests, Average Time For DNS Service, Number of External DNS Server Replies. +* Gain insights into DNS Lookup statistics: Number of External DNS Server Requests, Average Time for the DNS Service, Number of External DNS Server Replies. Squid Proxy ### Activity Trend -The **Squid Proxy (Classic) - Activity Trend** dashboard provides trends around denied request trend, action trend, time spent to serve, success and non-success response, remote hosts. 
+The **Squid Proxy (Classic) - Activity Trend** dashboard provides trends around denied requests, actions, time spent to serve, success and non-success responses, and remote hosts. Use this dashboard to: -* Gain insights into the average amount of time it takes to serve a request and the kind of method the request was. -* Gain insights into the average time spent to serve requests, the megabytes served, the trends in requests by actions, the count of successful 2xx and non 2xx response actions. -* Gain insights into the trends in the number of denied requests, the remote hosts traffic by requests, the remote hosts traffic by data volume. +* Gain insights into the average time it takes to serve a request and the request method used. +* Gain insights into the average time spent to serve requests, the megabytes served, the trends in requests by actions, and the count of successful 2xx and non-2xx response actions. +* Gain insights into the trends in the number of denied requests, the remote hosts' traffic by requests, and the remote hosts' traffic by data volume. Squid Proxy - ### HTTP Response Analysis -The **Squid Proxy (Classic) - HTTP Response Analysis** dashboard provides insights into HTTP response, HTTP code, the number of client errors, server errors, redirections outlier, URLs experiencing server errors. +The **Squid Proxy (Classic) - HTTP Response Analysis** dashboard provides insights into HTTP response, HTTP code, the number of client errors, server errors, redirection outliers, and URLs experiencing server errors. Use this dashboard to: * Gain insights into the count of HTTP responses, such as redirections, successes, client errors, or server errors, on an area chart. * Gain insights into client error URLs with information fields: URL, status code, and event count. -* Get detailed information on any outliers in redirection, client error, server error events on a line chart with thresholds.
+* Get detailed information on any outliers in redirection, client error, and server error events on a line chart with thresholds. Squid Proxy - ### Quality of Service The **Squid Proxy (Classic) - Quality of Service** dashboard provides insights into latency, the response time of requests according to HTTP action, and the response time according to location. @@ -729,14 +717,13 @@ Use this dashboard to: Squid Proxy - ## Create monitors for Squid Proxy app import CreateMonitors from '../../reuse/apps/create-monitors.md'; -### Squid Proxy Alerts +### Squid Proxy alerts | Alert Type (Metrics/Logs) | Alert Name | Alert Description | Trigger Type (Critical / Warning) | Alert Condition | Recover Condition | |:---|:---|:---|:---|:---|:---| From e7b328fa409e1e0256107328893528ecb701c73e Mon Sep 17 00:00:00 2001 From: Amee Lepcha Date: Fri, 9 May 2025 10:56:46 +0530 Subject: [PATCH 6/8] Update varnish.md --- docs/integrations/web-servers/varnish.md | 96 +++++++++++------------- 1 file changed, 44 insertions(+), 52 deletions(-) diff --git a/docs/integrations/web-servers/varnish.md b/docs/integrations/web-servers/varnish.md index d811713cd4..2320420af8 100644 --- a/docs/integrations/web-servers/varnish.md +++ b/docs/integrations/web-servers/varnish.md @@ -56,9 +56,9 @@ This section provides instructions for configuring log and metric collection for Configuring log and metric collection for the Varnish app includes the following tasks: -### Configure Logs and Metrics Collection +### Configure logs and metrics collection -Instructions below show how to configure Kubernetes and Non-Kubernetes environments. +Instructions below show how to configure Kubernetes and non-Kubernetes environments. -The first service in the pipeline is Telegraf. Telegraf collects metrics from Varnish. Note that we’re running Telegraf in each pod we want to collect metrics from as a sidecar deployment, for example, Telegraf runs in the same pod as the containers it monitors. 
Telegraf uses the Varnish input plugin to obtain metrics. For simplicity, the diagram doesn’t show the input plugins.The injection of the Telegraf sidecar container is done by the Telegraf Operator. -P -rometheus pulls metrics from Telegraf and sends them to [Sumo Logic Distribution for OpenTelemetry Collector](https://github.com/SumoLogic/sumologic-otel-collector), which enriches metadata and sends metrics to Sumo Logic. +The first service in the pipeline is Telegraf. Telegraf collects metrics from Varnish. Note that we’re running Telegraf in each pod we want to collect metrics from as a sidecar deployment, for example, Telegraf runs in the same pod as the containers it monitors. Telegraf uses the Varnish input plugin to obtain metrics. For simplicity, the diagram doesn’t show the input plugins. The injection of the Telegraf sidecar container is done by the Telegraf Operator. +Prometheus pulls metrics from Telegraf and sends them to [Sumo Logic Distribution for OpenTelemetry Collector](https://github.com/SumoLogic/sumologic-otel-collector), which enriches metadata and sends metrics to Sumo Logic. In the logs pipeline, Sumo Logic Distribution for OpenTelemetry Collector collects logs written to standard out and forwards them to another instance of Sumo Logic Distribution for OpenTelemetry Collector, which enriches metadata and sends logs to Sumo Logic. :::note Prerequisites -It’s assumed that you are using the latest helm chart version. If not, upgrade using the instructions [here](/docs/send-data/kubernetes). +It’s assumed that you are using the latest Helm chart version. If not, upgrade using the instructions [here](/docs/send-data/kubernetes). ::: -### Configure Metrics Collection +### Configure metrics collection This section explains the steps to collect Varnish metrics from a Kubernetes environment. 
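The annotated pod spec that the parameter descriptions below refer to is not reproduced in this diff hunk. As a hedged sketch only, a typical Telegraf-operator annotation block for Varnish looks roughly like this (the `varnishstat` path and the `CHANGEME` values are placeholder assumptions to replace):

```yaml
annotations:
  telegraf.influxdata.com/class: sumologic-prometheus
  prometheus.io/scrape: "true"
  prometheus.io/port: "9273"
  telegraf.influxdata.com/inputs: |
    [[inputs.varnish]]
      use_sudo = true
      binary = "/usr/bin/varnishstat"
      stats = ["*"]
      [inputs.varnish.tags]
        environment = "prod_CHANGEME"
        component = "cache"
        cache_system = "varnish"
        cache_cluster = "varnish_on_k8s_CHANGEME"
```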
@@ -111,19 +110,19 @@ This section explains the steps to collect Varnish metrics from a Kubernetes env Enter in values for the following parameters (marked `CHANGEME` in the snippet above): -* `telegraf.influxdata.com/inputs`. This contains the required configuration for the Telegraf varnish Input plugin. Please refer[ to this doc](https://github.com/influxdata/telegraf/tree/master/plugins/inputs/redis) for more information on configuring the Varnish input plugin for Telegraf. Note: As telegraf will be run as a sidecar, the host should always be localhost. +* `telegraf.influxdata.com/inputs`. This contains the required configuration for the Telegraf Varnish input plugin. Please refer to this [doc](https://github.com/influxdata/telegraf/tree/master/plugins/inputs/varnish) for more information on configuring the Varnish input plugin for Telegraf. Note: As Telegraf will be run as a sidecar, the host should always be localhost. * In the input plugins section, which is `[[inputs.varnish]]` - * `binary`. The default location of the varnish stat binary. Please override as per your configuration. + * `binary`. The default location of the Varnish stat binary. Please override as per your configuration. * `use_sudo`. If running as a restricted user, prepend sudo for additional access. * `stats`. Stats may also be set to `["*"]`, which will collect all stats. See [this doc](https://github.com/influxdata/telegraf/tree/master/plugins/inputs/varnish) for more information on additional parameters for configuring the Varnish input plugin for Telegraf. * In the tags section, which is `[inputs.varnish.tags]` - * `environment`. This is the deployment environment where the Varnish cluster identified by the value of servers resides. For example: dev, prod or qa. While this value is optional we highly recommend setting it. + * `environment`. This is the deployment environment where the Varnish cluster identified by the value of servers resides. For example: dev, prod, or qa.
While this value is optional, we highly recommend setting it. * `cache_cluster`. Enter a name to identify this Varnish cluster. This cluster name will be shown in the Sumo Logic dashboards. -**Do not modify** the following values set by this configuration as it will cause the Sumo Logic app to not function correctly. +**Do not modify** the following values set by this configuration, as it will cause the Sumo Logic app to not function correctly. * `telegraf.influxdata.com/class: sumologic-prometheus`. This instructs the Telegraf operator what output to use. This should not be changed. * `prometheus.io/scrape: "true"`. This ensures our Prometheus will scrape the metrics. -* `prometheus.io/port: "9273"`. This tells prometheus what ports to scrape on. This should not be changed. +* `prometheus.io/port: "9273"`. This tells Prometheus what ports to scrape on. This should not be changed. * `telegraf.influxdata.com/inputs` * In the tags section, `[inputs.varnish.tags]` * `component: “cache”`. This value is used by Sumo Logic apps to identify application components. @@ -134,24 +133,23 @@ For all other parameters, see [this doc](/docs/send-data/collect-from-other-data 3. Sumo Logic Kubernetes collection will automatically start collecting metrics from the pods having the labels and annotations defined in the previous step. 4. Verify metrics in Sumo Logic. - -### Configure Logs Collection +### Configure logs collection This section explains the steps to collect Varnish logs from a Kubernetes environment. 1. **(Recommended Method) Add labels on your Varnish pods to capture logs from standard output**. Follow the instructions below to capture Varnish logs from stdout on Kubernetes. - 1. Apply following labels to the Varnish pods: + 1. Apply the following labels to the Varnish pods: ```bash environment: "prod_CHANGEME" component: "cache" cache_system: "varnish" cache_cluster: "varnish_on_k8s_CHANGEME" ``` - 2. 
Enter in values for the following parameters (marked `CHANGEME` in the snippet above): - * `environment`. This is the deployment environment where the Varnish cluster identified by the value of `servers` resides. For example: dev, prod or qa. While this value is optional we highly recommend setting it. + 2. Enter the values for the following parameters (marked `CHANGEME` in the snippet above): + * `environment`. This is the deployment environment where the Varnish cluster identified by the value of `servers` resides. For example: dev, prod, or qa. While this value is optional,l we highly recommend setting it. * `cache_cluster`. Enter a name to identify this Varnish cluster. This cluster name will be shown in the Sumo Logic dashboards. -**Do not modify** the following values set by this configuration as it will cause the Sumo Logic app to not function correctly. +**Do not modify** the following values set by this configuration, as it will cause the Sumo Logic app to not function correctly. * `component: “cache”`. This value is used by Sumo Logic apps to identify application components. * `cache_system: “varnish”`. This value identifies the cache system. @@ -182,18 +180,18 @@ For all other parameters, see [this doc](/docs/send-data/collect-from-other-data -We use the Telegraf operator for Varnish metric collection and Sumo Logic Installed Collector for collecting Varnish logs. The diagram below illustrates the components of the Varnish collection in a non-Kubernetes environment. +We use the Telegraf operator for Varnish metric collection and the Sumo Logic Installed Collector for collecting Varnish logs. The diagram below illustrates the components of the Varnish collection in a non-Kubernetes environment. Varnish -Telegraf runs on the same system as Varnish, and uses the [Varnish input plugin](https://github.com/influxdata/telegraf/tree/master/plugins/inputs/varnish) to obtain Varnish metrics, and the Sumo Logic output plugin to send the metrics to Sumo Logic. 
Logs from Varnish on the other hand are sent to a Sumo Logic Local File source. +Telegraf runs on the same system as Varnish and uses the [Varnish input plugin](https://github.com/influxdata/telegraf/tree/master/plugins/inputs/varnish) to obtain Varnish metrics, and the Sumo Logic output plugin to send the metrics to Sumo Logic. Logs from Varnish, on the other hand, are sent to a Sumo Logic Local File source. -### Configure Metrics Collection +### Configure metrics collection This section provides instructions for configuring metrics collection for the Sumo Logic app for Varnish. 1. **Configure a Hosted Collector**. To create a new Sumo Logic hosted collector, perform the steps in the[Create a Hosted Collector](/docs/send-data/hosted-collectors/configure-hosted-collector) section of the Sumo Logic documentation. -2. **Configure an HTTP Logs and Metrics Source**. Create a new HTTP Logs and Metrics Source in the hosted collector created above by following[ these instructions. ](/docs/send-data/hosted-collectors/http-source/logs-metrics)Make a note of the **HTTP Source URL**. +2. **Configure an HTTP Logs and Metrics Source**. Create a new HTTP Logs and Metrics Source in the hosted collector created above by following the instructions given in this [document](/docs/send-data/hosted-collectors/http-source/logs-metrics). Make a note of the **HTTP Source URL**. 3. **Install Telegraf**. Use the[ following steps](/docs/send-data/collect-from-other-data-sources/collect-metrics-telegraf/install-telegraf.md) to install Telegraf. 4. **Configure and start Telegraf**. As part of collecting metrics data from Telegraf, we will use the [Varnish input plugin](https://github.com/influxdata/telegraf/tree/master/plugins/inputs/varnish) to get data from Telegraf and the [Sumo Logic output plugin](https://github.com/SumoLogic/fluentd-output-sumologic) to send data to Sumo Logic. 
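Putting the input and output plugins described above together, a minimal `telegraf.conf` sketch might look like the following; the `varnishstat` path and the URL placeholder are assumptions to adjust for your environment:

```toml
[[inputs.varnish]]
  use_sudo = true
  binary = "/usr/bin/varnishstat"
  stats = ["*"]
  [inputs.varnish.tags]
    environment = "prod_CHANGEME"
    component = "cache"
    cache_system = "varnish"
    cache_cluster = "varnish_cluster_CHANGEME"

[[outputs.sumologic]]
  url = "<HTTP_SOURCE_URL_CHANGEME>"
  data_format = "prometheus"
```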
@@ -215,18 +213,18 @@ Create or modify telegraf.conf and copy and paste the text below: Please enter values for the following parameters (marked `CHANGEME` above): * In the input plugins section, which is `[[inputs.varnish]]` - * **`binary`** - The default location of the varnish stat binary. Please override as per your configuration. + * **`binary`** - The default location of the Varnish stat binary. Please override as per your configuration. * `use_sudo` - If running as a restricted user, prepend sudo for additional access. * **`stats`** - Stats may also be set to ["*"], which will collect all stats. Please see [this doc](https://github.com/influxdata/telegraf/tree/master/plugins/inputs/varnish) for more information on additional parameters for configuring the Varnish input plugin for Telegraf. * In the tags section, which is `[inputs.varnish.tags]` - * `environment`. This is the deployment environment where the Varnish cluster identified by the value of **servers** resides. For example; dev, prod or qa. While this value is optional we highly recommend setting it. + * `environment`. This is the deployment environment where the Varnish cluster identified by the value of **servers** resides. For example, dev, prod, or qa. While this value is optional, we highly recommend setting it. * `cache_cluster`. Enter a name to identify this Varnish cluster. This cluster name will be shown in the Sumo Logic dashboards. * In the output plugins section, which is `[[outputs.sumologic]]` * `url` - This is the HTTP source URL created in step 3. Please see [this doc](/docs/send-data/collect-from-other-data-sources/collect-metrics-telegraf/configure-telegraf-output-plugin) for more information on additional parameters for configuring the Sumo Logic Telegraf output plugin. -**Do not modify** the following values set by this Telegraf configuration as it will cause the Sumo Logic app to not function correctly. 
+**Do not modify** the following values set by this Telegraf configuration, as it will cause the Sumo Logic app to not function correctly. * `data_format=“prometheus”`. In the output plugins section, which is `[[outputs.sumologic]]`. Metrics are sent in the Prometheus format to Sumo Logic -* `Component=“cache”`. In the input plugins section, which is, `[[inputs.varnish]]` - This value is used by Sumo Logic apps to identify application components. +* `Component=“cache”`. In the input plugins section, which is `[[inputs.varnish]]` - This value is used by Sumo Logic apps to identify application components. * `cache_system: “varnish”`. In the input plugins sections. In other words, this value identifies the cache system * For all other parameters, see [this doc](https://github.com/influxdata/telegraf/blob/master/etc/logrotate.d/telegraf) for more parameters that can be configured in the Telegraf agent globally. @@ -234,33 +232,31 @@ Once you have finalized your `telegraf.conf` file, you can start or reload the t At this point, Varnish metrics should start flowing into Sumo Logic. - -### Configure Logs Collection +### Configure logs collection This section provides instructions for configuring log collection for Varnish running on a non-Kubernetes environment for the Sumo Logic app for Varnish. -By default, Varnish logs are stored in a log file. Sumo Logic supports collecting logs via a local log file. Local log files can be collected via [Installed collectors](/docs/send-data/installed-collectors). An Installed collector will require you to allow outbound traffic to [Sumo Logic endpoints](/docs/api/getting-started#sumo-logic-endpoints-by-deployment-and-firewall-security) for collection to work. For detailed requirements for Installed collectors, see this [page](/docs/get-started/system-requirements#installed-collector-requirements). +By default, Varnish logs are stored in a log file. Sumo Logic supports collecting logs via a local log file. 
Local log files can be collected via [Installed collectors](/docs/send-data/installed-collectors). An installed collector will require you to allow outbound traffic to [Sumo Logic endpoints](/docs/api/getting-started#sumo-logic-endpoints-by-deployment-and-firewall-security) for collection to work. For detailed requirements for installed collectors, see this [page](/docs/get-started/system-requirements#installed-collector-requirements). - -1. **Configure logging in Varnish**. Varnish supports logging via the following methods: local text log files. For details please visit this [page](https://docs.varnish-software.com/tutorials/enabling-logging-with-varnishncsa/). For the dashboards to work properly, please set the below specified log format as explained [here](https://docs.varnish-software.com/tutorials/enabling-logging-with-varnishncsa/#step-3-customise-options-1): +1. **Configure logging in Varnish**. Varnish supports logging via the following methods: local text log files. For details, please visit this [page](https://docs.varnish-software.com/tutorials/enabling-logging-with-varnishncsa/). For the dashboards to work properly, please set the below-specified log format as explained [here](https://docs.varnish-software.com/tutorials/enabling-logging-with-varnishncsa/#step-3-customise-options-1): ```bash %h %l %u %t \"%r\" %s %b \"%{Referer}i\" \"%{User-agent}i\" ``` -2. **Configure Varnish to log to a Local file**. By default, any installation of varnishd will not write any request logs to disk. Instead, Varnish has an in-memory log, and supplies tools to tap into this log and write to disk. To configure logging to a local file, follow the steps on [this](https://docs.varnish-software.com/tutorials/enabling-logging-with-varnishncsa/#enable-varnishncsa-logging) page. By default, Varnish logs are stored in **/var/log/varnish/varnishncsa.log**. 
For customized options please visit this [page](https://docs.varnish-software.com/tutorials/enabling-logging-with-varnishncsa/#step-3-customise-options-1). Logs from the Varnish log file can be collected via a Sumo Logic [Installed collector](/docs/send-data/installed-collectors) and a [Local File Source](/docs/send-data/installed-collectors/sources/local-file-source) as explained in the next section. -3. **Configuring a Collector**. To add an Installed collector, perform the steps as defined on the page[ Configure an Installed Collector.](/docs/send-data/installed-collectors) -4. **Configuring a Source**: To add a Local File Source source for Varnish do the following. To collect logs directly from your Varnish machine, use an Installed Collector and a Local File Source. +2. **Configure Varnish to log to a Local file**. By default, any installation of varnishd will not write any request logs to disk. Instead, Varnish has an in-memory log and supplies tools to tap into this log and write to disk. To configure logging to a local file, follow the steps on [this](https://docs.varnish-software.com/tutorials/enabling-logging-with-varnishncsa/#enable-varnishncsa-logging) page. By default, Varnish logs are stored in **/var/log/varnish/varnishncsa.log**. For customized options, please visit this [page](https://docs.varnish-software.com/tutorials/enabling-logging-with-varnishncsa/#step-3-customise-options-1). Logs from the Varnish log file can be collected via a Sumo Logic [Installed collector](/docs/send-data/installed-collectors) and a [Local File Source](/docs/send-data/installed-collectors/sources/local-file-source) as explained in the next section. +3. **Configuring a Collector**. To add an installed collector, perform the steps given in the [Configure an Installed Collector](/docs/send-data/installed-collectors) document. +4. **Configuring a Source**: To add a Local File Source for Varnish, do the following. 
To collect logs directly from your Varnish machine, use an Installed Collector and a Local File Source. 1. Add a [Local File Source](/docs/send-data/installed-collectors/sources/local-file-source). 2. Configure the Local File Source fields as follows: * **Name.** (Required) * **Description.** (Optional) * **File Path (Required).** Enter the path to your error.log or access.log. The files are typically located in **/var/log/varnish/varnishncsa.log**. - * **Source Host.** Sumo Logic uses the hostname assigned by the OS unless you enter a different host name - * **Source Category.** Enter any string to tag the output collected from this Source, such as **Varnish/Logs**. (The Source Category metadata field is a fundamental building block to organize and label Sources. For details, see[ Best Practices](/docs/send-data/best-practices.md).) + * **Source Host.** Sumo Logic uses the hostname assigned by the OS unless you enter a different hostname. + * **Source Category.** Enter any string to tag the output collected from this Source, such as **Varnish/Logs**. (The Source Category metadata field is a fundamental building block to organize and label Sources. For details, see [Best Practices](/docs/send-data/best-practices.md).) * **Fields.** Set the following fields: * `component = cache` * `cache_system = varnish` * `cache_cluster = ` - * `environment = `, such as Dev, QA or Prod. + * `environment = `, such as Dev, QA, or Prod. 3. Configure the **Advanced** section: * **Enable Timestamp Parsing.** Select Extract timestamp information from log file entries. * **Time Zone.** Choose the option, **Ignore time zone from log file and instead use**, and then select your Varnish Server’s time zone. @@ -275,7 +271,6 @@ At this point, Varnish logs should start flowing into Sumo Logic. 
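As a quick local check that varnishncsa is emitting the NCSA-style format configured above, a log line can be parsed with a short sketch. This is illustrative only — the sample log line, field names, and helper function are hypothetical, not part of Varnish or Sumo Logic tooling:

```python
import re

# NCSA combined log format, matching the varnishncsa format string:
# %h %l %u %t \"%r\" %s %b \"%{Referer}i\" \"%{User-agent}i\"
NCSA_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<bytes>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

def parse_ncsa_line(line: str) -> dict:
    """Parse one varnishncsa log line into a field dict."""
    match = NCSA_PATTERN.match(line)
    if match is None:
        raise ValueError(f"unparseable log line: {line!r}")
    fields = match.groupdict()
    # %b is "-" when no response body was sent; normalize to 0 bytes.
    fields["bytes"] = 0 if fields["bytes"] == "-" else int(fields["bytes"])
    fields["status"] = int(fields["status"])
    return fields

# Hypothetical sample line for illustration:
sample = ('203.0.113.7 - - [08/May/2025:16:26:11 +0530] '
          '"GET /index.html HTTP/1.1" 200 5124 '
          '"https://example.com/" "Mozilla/5.0"')
parsed = parse_ncsa_line(sample)
print(parsed["status"], parsed["bytes"])  # → 200 5124
```

If a line from **/var/log/varnish/varnishncsa.log** fails to parse this way, the dashboards that depend on the format above will likely not populate either.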
- ## Installing the Varnish app import AppInstall2 from '../../reuse/apps/app-install-sc-k8s.md'; @@ -295,7 +290,6 @@ Additionally, if you're using Varnish in the Kubernetes environment, the followi * `pod_labels_cache_system` * `pod_labels_cache_cluster` - ## Viewing Varnish dashboards import ViewDashboards from '../../reuse/apps/view-dashboards.md'; @@ -304,14 +298,14 @@ import ViewDashboards from '../../reuse/apps/view-dashboards.md'; ### Overview -The **Varnish (Classic) - Overview** dashboard provides a high-level view of the activity and health of Varnish servers on your network. Dashboard panels display visual graphs and detailed information on visitor geographic locations, traffic volume and distribution, responses over time, and time comparisons for visitor locations and uptime, cache hit, requests, VLC. +The **Varnish (Classic) - Overview** dashboard provides a high-level view of the activity and health of Varnish servers on your network. Dashboard panels display visual graphs and detailed information on visitor geographic locations, traffic volume and distribution, responses over time, and time comparisons for visitor locations and uptime, cache hit, requests, and VLC. Use this dashboard to: * Analyze Request backend, frontend, VLCs, Pool, Thread, VMODs, and cache hit rate. -* Analyze HTTP request about status code +* Analyze the HTTP request for the status code. * Gain insights into Network traffic for your Varnish server. -* Gain insights into originated traffic location by region. This can help you allocate computer resources to different regions according to their needs. -* Gain insights into Client, Server Responses on Varnish Server. This helps you identify errors in Varnish Server. +* Gain insights into the origin of traffic location by region. This can help you allocate computer resources to different regions according to their needs. +* Gain insights into Client, Server Responses on Varnish Server. 
This helps you identify errors in the Varnish Server. Varnish dashboard @@ -329,7 +323,7 @@ Use this dashboard to: ### Web Server Operations -The **Varnish (Classic) - Web Server Operations** dashboard provides a high-level view combined with detailed information on the top ten bots, geographic locations, and data for clients with high error rates, server errors over time, and non 200 response code status codes. Dashboard panels also show server error logs, error log levels, error responses by the server, and the top URLs responsible for 404 responses. +The **Varnish (Classic) - Web Server Operations** dashboard provides a high-level view combined with detailed information on the top ten bots, geographic locations, and data for clients with high error rates, server errors over time, and non-200 response code status codes. Dashboard panels also show server error logs, error log levels, error responses by the server, and the top URLs responsible for 404 responses. Use this dashboard to: * Determine failures in responding. @@ -344,7 +338,7 @@ The **Varnish (Classic) - Traffic Timeline Analysis** dashboard provides a high- Use this dashboard to: * To understand the traffic distribution across servers, provide insights for resource planning by analyzing data volume and bytes served. -* Gain insights into originated traffic location by region. This can help you allocate compute resources to different regions according to their needs. +* Gain insights into the origin of traffic location by region. This can help you allocate compute resources to different regions according to their needs. Varnish dashboard @@ -360,7 +354,7 @@ Use this dashboard to: ### Threat Intel -The **Varnish (Classic) - Threat Intel** dashboard provides an at-a-glance view of threats to Varnish servers on your network. 
Dashboard panels display threats count over a selected time period, geographic locations where threats occurred, source breakdown, actors responsible for threats, severity, and a correlation of IP addresses, method, and status code of threats. +The **Varnish (Classic) - Threat Intel** dashboard provides an at-a-glance view of threats to Varnish servers on your network. Dashboard panels display threat count over a selected time period, geographic locations where threats occurred, source breakdown, actors responsible for threats, severity, and a correlation of IP addresses, method, and status code of threats. Use this dashboard to: * To gain insights and understand threats in incoming traffic and discover potential IOCs. @@ -379,17 +373,16 @@ Use this dashboard to: ### Bans and Bans Lurker -The **Varnish (Classic) - Bans and Bans Lurker** dashboard provides you the list of Bans filters applied to keep Varnish from serving stale content. +The **Varnish (Classic) - Bans and Bans Lurker** dashboard provides you with the list of Bans filters applied to keep Varnish from serving stale content. Use this dashboard to: * Gain insights into bans and make sure that Varnish is serving the latest content. Varnish dashboard - ### Cache Performance -The **Varnish (Classic) - Cache Performance** dashboard provides worker thread related metrics to tell you if your thread pools are healthy and functioning well. +The **Varnish (Classic) - Cache Performance** dashboard provides worker thread-related metrics to tell you if your thread pools are healthy and functioning well. Use this dashboard to: * Gain insights into the performance and health of Varnish Cache. @@ -409,21 +402,20 @@ Use this dashboard to: ### Threads -The **Varnish (Classic) - Threads** Dashboard helps you to keep track of threads metrics to watch your Varnish Cache. +The **Varnish (Classic) - Threads** Dashboard helps you to keep track of thread metrics to watch your Varnish Cache. 
Use this dashboard to: * Manage and understand threads in the Varnish system. Varnish dashboard - ## Create monitors for Varnish app import CreateMonitors from '../../reuse/apps/create-monitors.md'; -### Varnish Alerts +### Varnish alerts | Alert Type (Metrics/Logs) | Alert Name | Alert Description | Trigger Type (Critical / Warning) | Alert Condition | Recover Condition | |:---|:---|:---|:---|:---|:---| From 87a12b38eda80d2ed741d844b2873a92ddce8403 Mon Sep 17 00:00:00 2001 From: Amee Lepcha Date: Fri, 9 May 2025 11:00:34 +0530 Subject: [PATCH 7/8] Update strimzi-kafka.md --- .../containers-orchestration/strimzi-kafka.md | 33 +++++-------------- 1 file changed, 9 insertions(+), 24 deletions(-) diff --git a/docs/integrations/containers-orchestration/strimzi-kafka.md b/docs/integrations/containers-orchestration/strimzi-kafka.md index 848adb3659..c43128ab31 100644 --- a/docs/integrations/containers-orchestration/strimzi-kafka.md +++ b/docs/integrations/containers-orchestration/strimzi-kafka.md @@ -21,7 +21,6 @@ This App has been tested with the following Kafka Operator versions: This App has been tested with the following Kafka versions: * 3.4.0 - ## Sample log messages ```json @@ -48,14 +47,13 @@ messaging_cluster=* messaging_system="kafka" \ The list of metrics collected can be found [here](/docs/integrations/containers-orchestration/kafka/#kafka-metrics). - -## Collecting logs and metrics for Strimzi Kafka Pods +## Collecting logs and metrics for Strimzi Kafka pods Collection architecture is similar to Kafka and described [here](/docs/integrations/containers-orchestration/strimzi-kafka/#collecting-logs-and-metrics-for-strimzi-kafka-pods). This section provides instructions for configuring log and metric collection for the Sumo Logic App for Strimzi Kafka. 
-### Prerequisites for Kafka Cluster Deployment +### Prerequisites for Kafka cluster deployment Before configuring the collection, you will require the following items: @@ -65,7 +63,7 @@ Before configuring the collection, you will require the following items: 3. Download the [kafka-metrics-sumologic-telegraf.yaml](https://drive.google.com/file/d/1pvMqYiJu7_nEv2F2RsPKIn_WWs8BKcxQ/view?usp=sharing). If you already have an existing yaml, you will have to merge the contents of both files. This file contains the Kafka resource. -### Deploying Sumo Logic Kubernetes Collection +### Deploying Sumo Logic Kubernetes collection 1. Create a new namespace to deploy resources. The below command creates a **sumologiccollection** namespace. @@ -92,7 +90,7 @@ Before configuring the collection, you will require the following items: A collector will be created in your Sumo Logic org with the cluster name provided in the above command. You can verify it by referring to the [collection page](/docs/send-data/collection/). -### Configure Metrics Collection +### Configure metrics collection Follow these steps to collect metrics from a Kubernetes environment: @@ -143,8 +141,7 @@ Follow these steps to collect metrics from a Kubernetes environment: For more information on configuring the Jolokia input plugin for Telegraf, see [this doc](https://github.com/influxdata/telegraf/tree/master/plugins/inputs/jolokia2). -### Configure Logs Collection +### Configure logs collection If your Kafka helm chart/pod is writing the logs to standard output, then the [Sumologic Kubernetes Collection](/docs/integrations/containers-orchestration/kubernetes/#collecting-metrics-and-logs-for-the-kubernetes-app) will automatically capture the logs from stdout and send them to Sumo Logic. If not, you have to use the [tailing-sidecar](https://github.com/SumoLogic/tailing-sidecar/blob/main/README.md) approach. 1.
**Add labels on your Kafka pods** @@ -177,7 +174,7 @@ If your Kafka helm chart/pod is writing the logs to standard output, then the [S Sumo Logic FER -### Deploying Kafka Pods +### Deploying Kafka pods After updating **kafka-metrics-sumologic-telegraf.yaml**, you can use the below command to deploy the Kafka pods @@ -185,8 +182,7 @@ If your Kafka helm chart/pod is writing the logs to standard output, then the [S kubectl apply -f kafka-metrics-sumologic-telegraf.yaml -n <> ``` - -### Deployment Verification +### Deployment verification 1. Make sure that the Kafka pods are running and correct annotations/labels are applied by using the command: ```bash @@ -233,8 +229,7 @@ If your Kafka helm chart/pod is writing the logs to standard output, then the [S values: ["sumologiccollection"] ``` - -## Installing the Kafka App +## Installing the Kafka app import AppInstall2 from '../../reuse/apps/app-install-sc-k8s.md'; @@ -253,7 +248,7 @@ Additionally, if you're using Squid Proxy in the Kubernetes environment, the fol * `pod_labels_messaging_system` * `pod_labels_messaging_cluster` -## Viewing the Kafka Dashboards +## Viewing the Kafka dashboards import ViewDashboards from '../../reuse/apps/view-dashboards.md'; @@ -270,7 +265,6 @@ Use this dashboard to: Kafka dashboards - ### Strimzi Kafka - Outlier Analysis The **Strimzi Kafka - Outlier Analysis** dashboard helps you identify outliers for key metrics across your Kafka clusters. @@ -280,7 +274,6 @@ Use this dashboard to: Kafka dashboards - ### Strimzi Kafka - Replication The Strimzi Kafka - Replication dashboard helps you understand the state of replicas in your Kafka clusters. @@ -296,7 +289,6 @@ Use this dashboard to monitor the following key metrics: Kafka dashboards - ### Strimzi Kafka - Zookeeper The **Strimzi Kafka -Zookeeper** dashboard provides an at-a-glance view of the state of your partitions, active controllers, leaders, throughput, and network across Kafka brokers and clusters. 
@@ -323,7 +315,6 @@ Use this dashboard to: Kafka dashboards - ### Strimzi Kafka - Failures and Delayed Operations The **Strimzi Kafka - Failures and Delayed Operations** dashboard gives you insight into all failures and delayed operations associated with your Kafka clusters. @@ -340,7 +331,6 @@ Use this dashboard to: Kafka dashboards - ### Strimzi Kafka - Request-Response Times The **Strimzi Kafka - Request-Response** **Times** dashboard helps you get insight into key request and response latencies of your Kafka cluster. @@ -397,7 +387,6 @@ Use this dashboard to: Kafka dashboards - ### Kafka Broker - Disk Usage The **Kafka Broker - Disk Usage** dashboard helps monitor disk usage across your Kafka Brokers. @@ -421,7 +410,6 @@ Use this dashboard to: Kafka dashboards - ### Kafka Broker - Threads The **Kafka Broker - Threads** dashboard shows the key insights into the usage and type of threads created in your Kafka broker JVM @@ -443,7 +431,6 @@ Use this dashboard to: Kafka dashboards - ### Strimzi Kafka - Topic Overview The Strimzi Kafka - Topic Overview dashboard helps you quickly identify under-replicated partitions and incoming bytes by Kafka topic, server, and cluster. @@ -459,8 +446,6 @@ Use this dashboard to: Kafka dashboards - - ### Strimzi Kafka - Topic Details The Strimzi Kafka - Topic Details dashboard gives you insight into throughput, partition sizes, and offsets across Kafka brokers, topics, and clusters. 
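The Telegraf metrics collection described earlier relies on Jolokia exposing Kafka's JMX MBeans over HTTP. As a sketch of how such an endpoint can be spot-checked outside Telegraf — the host and port are assumptions (a hypothetical Jolokia agent on port 8778); adjust them for your cluster:

```python
import json
import urllib.parse
import urllib.request

# Hypothetical Jolokia endpoint; adjust to wherever the agent is exposed
# in your Kafka broker pod (e.g. via `kubectl port-forward`).
JOLOKIA_BASE = "http://localhost:8778/jolokia"

def jolokia_read_url(base: str, mbean: str) -> str:
    """Build the URL for Jolokia's REST 'read' operation on one MBean."""
    # Keep '=', ',' and ':' literal — they are part of JMX object names.
    return f"{base}/read/{urllib.parse.quote(mbean, safe='=,:')}"

def read_mbean(mbean: str) -> dict:
    """Fetch an MBean's attributes; raise if Jolokia reports an error."""
    with urllib.request.urlopen(jolokia_read_url(JOLOKIA_BASE, mbean),
                                timeout=10) as resp:
        payload = json.load(resp)
    if payload.get("status") != 200:
        raise RuntimeError(f"Jolokia error: {payload}")
    return payload["value"]

# Example (requires a live broker with a Jolokia agent):
# read_mbean("kafka.server:type=ReplicaManager,name=UnderReplicatedPartitions")
print(jolokia_read_url(
    JOLOKIA_BASE,
    "kafka.server:type=ReplicaManager,name=UnderReplicatedPartitions"))
```

A non-200 Jolokia status or a connection error here usually means the Telegraf `jolokia2` input will fail for the same reason, which is a faster loop than waiting for dashboards to stay empty.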
From 8770014ffe9ccc4d3e0145a10119f1494d066d17 Mon Sep 17 00:00:00 2001 From: Amee Lepcha Date: Fri, 9 May 2025 11:03:14 +0530 Subject: [PATCH 8/8] Update iis-7.md --- docs/integrations/microsoft-azure/iis-7.md | 11 ++--------- 1 file changed, 2 insertions(+), 9 deletions(-) diff --git a/docs/integrations/microsoft-azure/iis-7.md b/docs/integrations/microsoft-azure/iis-7.md index 2eeea8543b..81cb56cee9 100644 --- a/docs/integrations/microsoft-azure/iis-7.md +++ b/docs/integrations/microsoft-azure/iis-7.md @@ -11,7 +11,6 @@ import useBaseUrl from '@docusaurus/useBaseUrl'; The IIS 7 App monitors the performance and reliability of your Microsoft Internet Information Services (IIS) infrastructure, identifying customer-facing and internal operational issues. Additionally, you can monitor customer paths and interactions to learn how customers are using your product. The app consists of predefined searches and Dashboards, which provide visibility into your environment for real-time or historical analysis. - ## Log types IIS 7 Logs (IIS 7.5 logs are used) are generated as local files and written to this directory by default: `C:\inetpub\Logs\LogFiles\W3SVC1`. The App assumes the following format: @@ -66,7 +65,6 @@ _sourceCategory=IIS* | transpose row _timeslice column app ``` - The following query is taken from the **OSes and Browsers** panel of the **IIS 7 Traffic Insights - Content and Client Platform Dashboard**. ```sql title="Operating Systems (OSes) and Browsers" @@ -93,12 +91,10 @@ if (agent matches "Dolphin*","Dolphin", Browser) as Browser | transpose row os column browser as * ``` - ## Collecting logs for IIS 7 This procedure explains how to enable logging from Microsoft Internet Information Services (IIS) on your Windows server and ingest the logs into Sumo Logic. - ### Prerequisites To prepare for logging IIS 7 events, perform the following two tasks. @@ -137,12 +133,10 @@ To confirm that the log files are being created, do the following: 1. 
Open a command-line window and change directories to `C:\inetpub\Logs\LogFiles`. This is the same path you will enter when you configure the Source to collect these files. 2. Under the `\W3SVC1` directory, you should see one or more files with a `.log` extension. If the file is present, you can collect it. - ### Step 1: Configure a Collector Configure an [Installed Collector (Windows)](/docs/send-data/installed-collectors/windows). Sumo Logic recommends that you install the collector on the same system that hosts the logs. - ### Step 2: Configure a Source To collect logs from IIS 7, use an Installed Collector and a Local File Source. You may also configure a [Remote File Source](/docs/send-data/installed-collectors/sources/remote-file-source), but the configuration is more complex. Sumo Logic recommends using a Local File Source if possible. @@ -166,14 +160,13 @@ To collect logs from IIS 7, use an Installed Collector and a Local File Source. After a few minutes, your new Source should be propagated down to the Collector and will begin submitting your IIS log files to the Sumo Logic service. - ## Field Extraction Rules
**FER to normalize the fields**. A Field Extraction Rule named **AppObservabilityIIS7FER** is automatically created for IIS 7/8 Application Components.
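The Field Extraction Rule performs this normalization inside Sumo Logic. For a local sanity check of the W3C extended format that IIS writes (a `#Fields:` directive naming the columns, followed by space-separated data lines), here is a minimal parsing sketch — the sample lines and helper name are illustrative:

```python
def parse_w3c_log(lines):
    """Yield IIS W3C extended log entries as dicts keyed by field name."""
    fields = []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("#Fields:"):
            fields = line.split()[1:]   # column names follow the directive
        elif line.startswith("#") or not line:
            continue                    # skip other directives / blank lines
        else:
            yield dict(zip(fields, line.split()))

# Hypothetical sample for illustration:
sample = [
    "#Fields: date time s-ip cs-method cs-uri-stem sc-status",
    "2025-05-08 10:51:02 192.168.0.1 GET /default.htm 200",
]
rows = list(parse_w3c_log(sample))
print(rows[0]["cs-method"], rows[0]["sc-status"])  # → GET 200
```

Checking a few lines from `C:\inetpub\Logs\LogFiles\W3SVC1` this way confirms the field order matches what the FER and app dashboards expect.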
-## Installing the IIS 7 App +## Installing the IIS 7 app import AppInstall from '../../reuse/apps/app-install-v2.md'; @@ -193,7 +186,7 @@ As part of the app installation process, the following fields will be created by * `cs_uri_stem` * `cs_username` -## Viewing IIS 7 Dashboards +## Viewing IIS 7 dashboards import ViewDashboards from '../../reuse/apps/view-dashboards.md';