diff --git a/blog-service/2025-08-26-apps.md b/blog-service/2025-08-26-apps.md new file mode 100644 index 0000000000..b0d69defa4 --- /dev/null +++ b/blog-service/2025-08-26-apps.md @@ -0,0 +1,14 @@ +--- +title: OpenTelemetry Collector Insights (Apps) +image: https://help.sumologic.com/img/reuse/rss-image.jpg +keywords: + - apps + - sumo-logic + - opentelemetry-collector-insights +hide_table_of_contents: true +--- + +import useBaseUrl from '@docusaurus/useBaseUrl'; + + +We're excited to introduce the new OpenTelemetry Collector Insights app for Sumo Logic. This app offers robust monitoring and observability for Sumo Logic OpenTelemetry Collector instances (version 0.130.1-sumo-0 and above), enabling you to track performance, data flow, and resource usage through prebuilt dashboards and alerts. [Learn more](/docs/integrations/sumo-apps/opentelemetry-collector-insights/). \ No newline at end of file diff --git a/cid-redirects.json b/cid-redirects.json index 283b8910d6..37ffd291f9 100644 --- a/cid-redirects.json +++ b/cid-redirects.json @@ -2926,6 +2926,7 @@ "/cid/10999": "/docs/send-data/collect-from-other-data-sources/azure-monitoring/ms-azure-event-hubs-source", "/cid/11000": "/docs/platform-services/automation-service/automation-service-playbooks", "/cid/1105": "/docs/integrations/cloud-security-monitoring-analytics/aws-security-hub-ocsf", + "/cid/1106": "/docs/integrations/sumo-apps/opentelemetry-collector-insights", "/Cloud_SIEM_Enterprise": "/docs/cse", "/Cloud_SIEM_Enterprise/Administration": "/docs/cse/administration", "/Cloud_SIEM_Enterprise/Administration/Cloud_SIEM_Enterprise_Feature_Update_(2022)": "/docs/cse/administration", diff --git a/docs/integrations/hosts-operating-systems/opentelemetry/index.md b/docs/integrations/hosts-operating-systems/opentelemetry/index.md index 16dc657072..637e4dad9d 100644 --- a/docs/integrations/hosts-operating-systems/opentelemetry/index.md +++ b/docs/integrations/hosts-operating-systems/opentelemetry/index.md @@ -1,10 
+1,10 @@ --- slug: /integrations/hosts-operating-systems/opentelemetry title: OpenTelemetry -description: Learn about our Sumo Logic OpenTelemetry apps that you can use to monitor host metrics and Linux. +description: Learn about our Sumo Logic OpenTelemetry apps for monitoring hosts, operating systems, and OpenTelemetry Collector infrastructure. --- -This guide has documentation for Sumo Logic OpenTelemetry apps. +This guide contains documentation for Sumo Logic OpenTelemetry apps that monitor hosts and operating systems. import DocCardList from '@theme/DocCardList'; import {useCurrentSidebarCategory} from '@docusaurus/theme-common'; diff --git a/docs/integrations/product-list/product-list-m-z.md b/docs/integrations/product-list/product-list-m-z.md index 8383a746a7..45519c3130 100644 --- a/docs/integrations/product-list/product-list-m-z.md +++ b/docs/integrations/product-list/product-list-m-z.md @@ -163,7 +163,7 @@ For descriptions of the different types of integrations Sumo Logic offers, see [ | Thumbnail icon | [Strimzi](https://strimzi.io/) | App: [Strimzi Kafka](/docs/integrations/containers-orchestration/strimzi-kafka/) | | Thumbnail icon | [Stripe](https://stripe.com/) | Webhook: [Stripe](/docs/integrations/webhooks/stripe/) | | Thumbnail icon | [Sucuri](https://sucuri.net/) | Cloud SIEM integration: [Sucuri](https://github.com/SumoLogic/cloud-siem-content-catalog/blob/master/vendors/cdfd2ba0-77eb-4e11-b071-6f4d01fda607.md) | -| Thumbnail icon | [Sumo Logic](https://www.sumologic.com/) | Apps:
- [Enterprise Audit - Cloud SIEM](/docs/integrations/sumo-apps/cse/)
- [Flex](/docs/integrations/sumo-apps/flex/)
- [Sumo Collection](/docs/integrations/saas-cloud/sumo-collection)
- [Sumo Logic Audit](/docs/integrations/sumo-apps/audit/)
- [Sumo Logic Data Volume](/docs/integrations/sumo-apps/data-volume/)
- [Sumo Logic Enterprise Audit](/docs/integrations/sumo-apps/enterprise-audit/) (multiple apps)
- [Sumo Logic Enterprise Search Audit](/docs/integrations/sumo-apps/enterprise-search-audit/)
- [Sumo Logic Infrequent Data Tier](/docs/integrations/sumo-apps/infrequent-data-tier/)
- [Sumo Logic Kickstart Data](/docs/integrations/sumo-apps/kickstart-data)
- [Sumo Logic Log Analysis QuickStart](/docs/integrations/sumo-apps/log-analysis-quickstart/)
- [Sumo Logic Security Analytics](/docs/integrations/sumo-apps/security-analytics/)
Automation integrations:
- [Automation Tools](/docs/platform-services/automation-service/app-central/integrations/sumo-logic-automation-tools/)
- [Basic Tools](/docs/platform-services/automation-service/app-central/integrations/basic-tools/)
- [ESMTP](/docs/platform-services/automation-service/app-central/integrations/esmtp/)
- [HTTP Tools](/docs/platform-services/automation-service/app-central/integrations/http-tools/)
- [Incident Tools](/docs/platform-services/automation-service/app-central/integrations/incident-tools/)
- [IMAP](/docs/platform-services/automation-service/app-central/integrations/imap/)
- [Mail Tools](/docs/platform-services/automation-service/app-central/integrations/mail-tools/)
- [POP3](/docs/platform-services/automation-service/app-central/integrations/pop3/)
- [SMTP V3](/docs/platform-services/automation-service/app-central/integrations/smtp-v3/)
- [Sumo Logic Cloud SIEM](/docs/platform-services/automation-service/app-central/integrations/sumo-logic-cloud-siem/)
- [Sumo Logic Cloud SIEM Internal](/docs/platform-services/automation-service/app-central/integrations/sumo-logic-cloud-siem-internal/)
- [Sumo Logic Log Analytics](/docs/platform-services/automation-service/app-central/integrations/sumo-logic-log-analytics/)
- [Sumo Logic Log Analytics Internal](/docs/platform-services/automation-service/app-central/integrations/sumo-logic-log-analytics-internal/)
- [Sumo Logic Notifications](/docs/platform-services/automation-service/app-central/integrations/sumo-logic-notifications/)
- [Sumo Logic Notifications by Gmail](/docs/platform-services/automation-service/app-central/integrations/sumo-logic-notifications-by-gmail/)
- [Sumo Logic Notifications by Microsoft](/docs/platform-services/automation-service/app-central/integrations/sumo-logic-notifications-by-microsoft)
- [Triage Tools](/docs/platform-services/automation-service/app-central/integrations/triage-tools/)
- [ZIP Tools](/docs/platform-services/automation-service/app-central/integrations/zip-tools/)
Cloud SIEM integration: [Sumo Logic](https://github.com/SumoLogic/cloud-siem-content-catalog/blob/master/vendors/34A5019C-7BEC-4BF8-A3B7-C38D567126C6.md)
Collector:
- [Sumo Collection](/docs/send-data/hosted-collectors/cloud-to-cloud-integration-framework/sumo-collection-source)
- [Universal Connector](/docs/send-data/hosted-collectors/cloud-to-cloud-integration-framework/universal-connector-source)
Community app: [Cloud Security Posture Management (CSPM) for Sumo Logic](https://github.com/SumoLogic/sumologic-content/tree/master/CSPM)
Webhooks:
- [Scheduled Searches for Webhook Connections](/docs/alerts/webhook-connections/schedule-searches-webhook-connections/)
- [Using the Audit Index with Webhook Connections](/docs/alerts/webhook-connections/audit-index/)
- [Webhook Connection for Cloud SOAR](/docs/alerts/webhook-connections/cloud-soar/) | +| Thumbnail icon | [Sumo Logic](https://www.sumologic.com/) | Apps:
- [Enterprise Audit - Cloud SIEM](/docs/integrations/sumo-apps/cse/)
- [Flex](/docs/integrations/sumo-apps/flex/)
- [Sumo Collection](/docs/integrations/saas-cloud/sumo-collection)
- [Sumo Logic Audit](/docs/integrations/sumo-apps/audit/)
- [Sumo Logic Data Volume](/docs/integrations/sumo-apps/data-volume/)
- [Sumo Logic Enterprise Audit](/docs/integrations/sumo-apps/enterprise-audit/) (multiple apps)
- [Sumo Logic Enterprise Search Audit](/docs/integrations/sumo-apps/enterprise-search-audit/)
- [Sumo Logic Infrequent Data Tier](/docs/integrations/sumo-apps/infrequent-data-tier/)
- [Sumo Logic Kickstart Data](/docs/integrations/sumo-apps/kickstart-data)
- [Sumo Logic Log Analysis QuickStart](/docs/integrations/sumo-apps/log-analysis-quickstart/)
- [Sumo Logic OpenTelemetry Collector Insights](/docs/integrations/sumo-apps/opentelemetry-collector-insights/)
- [Sumo Logic Security Analytics](/docs/integrations/sumo-apps/security-analytics/)
Automation integrations:
- [Automation Tools](/docs/platform-services/automation-service/app-central/integrations/sumo-logic-automation-tools/)
- [Basic Tools](/docs/platform-services/automation-service/app-central/integrations/basic-tools/)
- [ESMTP](/docs/platform-services/automation-service/app-central/integrations/esmtp/)
- [HTTP Tools](/docs/platform-services/automation-service/app-central/integrations/http-tools/)
- [Incident Tools](/docs/platform-services/automation-service/app-central/integrations/incident-tools/)
- [IMAP](/docs/platform-services/automation-service/app-central/integrations/imap/)
- [Mail Tools](/docs/platform-services/automation-service/app-central/integrations/mail-tools/)
- [POP3](/docs/platform-services/automation-service/app-central/integrations/pop3/)
- [SMTP V3](/docs/platform-services/automation-service/app-central/integrations/smtp-v3/)
- [Sumo Logic Cloud SIEM](/docs/platform-services/automation-service/app-central/integrations/sumo-logic-cloud-siem/)
- [Sumo Logic Cloud SIEM Internal](/docs/platform-services/automation-service/app-central/integrations/sumo-logic-cloud-siem-internal/)
- [Sumo Logic Log Analytics](/docs/platform-services/automation-service/app-central/integrations/sumo-logic-log-analytics/)
- [Sumo Logic Log Analytics Internal](/docs/platform-services/automation-service/app-central/integrations/sumo-logic-log-analytics-internal/)
- [Sumo Logic Notifications](/docs/platform-services/automation-service/app-central/integrations/sumo-logic-notifications/)
- [Sumo Logic Notifications by Gmail](/docs/platform-services/automation-service/app-central/integrations/sumo-logic-notifications-by-gmail/)
- [Sumo Logic Notifications by Microsoft](/docs/platform-services/automation-service/app-central/integrations/sumo-logic-notifications-by-microsoft)
- [Triage Tools](/docs/platform-services/automation-service/app-central/integrations/triage-tools/)
- [ZIP Tools](/docs/platform-services/automation-service/app-central/integrations/zip-tools/)
Cloud SIEM integration: [Sumo Logic](https://github.com/SumoLogic/cloud-siem-content-catalog/blob/master/vendors/34A5019C-7BEC-4BF8-A3B7-C38D567126C6.md)
Collector:
- [Sumo Collection](/docs/send-data/hosted-collectors/cloud-to-cloud-integration-framework/sumo-collection-source)
- [Universal Connector](/docs/send-data/hosted-collectors/cloud-to-cloud-integration-framework/universal-connector-source)
Community app: [Cloud Security Posture Management (CSPM) for Sumo Logic](https://github.com/SumoLogic/sumologic-content/tree/master/CSPM)
Webhooks:
- [Scheduled Searches for Webhook Connections](/docs/alerts/webhook-connections/schedule-searches-webhook-connections/)
- [Using the Audit Index with Webhook Connections](/docs/alerts/webhook-connections/audit-index/)
- [Webhook Connection for Cloud SOAR](/docs/alerts/webhook-connections/cloud-soar/) | | Thumbnail icon | [Superwise](https://superwise.ai/) | Webhook: [Superwise](/docs/integrations/webhooks/superwise/) | | Thumbnail icon | [Symantec](https://sep.securitycloud.symantec.com/v2/landing) | App:
- [Symantec Endpoint Security Service](/docs/integrations/saas-cloud/symantec-endpoint-security-service/)
- [Symantec Web Security Service](/docs/integrations/saas-cloud/symantec-web-security-service/)
Automation integrations:
- [Javelin AD Protect](/docs/platform-services/automation-service/app-central/integrations/javelin-ad-protect/)
- [Symantec DeepSight](/docs/platform-services/automation-service/app-central/integrations/symantec-deepsight/)
- [Symantec EDR](/docs/platform-services/automation-service/app-central/integrations/symantec-edr/)
- [Symantec Endpoint Protection](/docs/platform-services/automation-service/app-central/integrations/symantec-endpoint-protection/)
- [Symantec Endpoint Protection Cloud](/docs/platform-services/automation-service/app-central/integrations/symantec-endpoint-protection-cloud/)
- [Symantec Secure Web Gateway (Bluecoat)](/docs/platform-services/automation-service/app-central/integrations/symantec-secure-web-gateway-bluecoat/)
- [Symantec WebPulse](/docs/platform-services/automation-service/app-central/integrations/symantec-webpulse/)
Collectors:
- [Symantec Endpoint Security Source](/docs/send-data/hosted-collectors/cloud-to-cloud-integration-framework/symantec-endpoint-security-source/)
- [Symantec Web Security Service Source](/docs/send-data/hosted-collectors/cloud-to-cloud-integration-framework/symantec-web-security-service-source/)
Cloud SIEM integration: [Symantec](https://github.com/SumoLogic/cloud-siem-content-catalog/blob/master/vendors/64c7f49c-f95a-4f4a-8540-56ec5fb1d96b.md)
Community app: [Sumo Logic for Symantec WSS](https://github.com/SumoLogic/sumologic-content/tree/master/Symantec/WSS) | | Thumbnail icon | [Sysdig](https://sysdig.com/) | App: [Sysdig Secure](/docs/integrations/saas-cloud/sysdig-secure/)
Cloud SIEM integration: [Sysdig](https://github.com/SumoLogic/cloud-siem-content-catalog/blob/master/vendors/c4de0854-e718-45e1-a4c8-63623755aa43.md)
Collector: [Sysdig Secure](/docs/send-data/hosted-collectors/cloud-to-cloud-integration-framework/sysdig-secure-source.md) | diff --git a/docs/integrations/sumo-apps/index.md b/docs/integrations/sumo-apps/index.md index d18fd140e6..01d508da59 100644 --- a/docs/integrations/sumo-apps/index.md +++ b/docs/integrations/sumo-apps/index.md @@ -24,6 +24,8 @@ You may upgrade your account at any time. In these instances, an admin can reque Once a request has been submitted, a support ticket is automatically opened. A representative from Sumo Logic will respond to your request as soon as possible, generally between one and two business days. Depending on the app that's been requested, Sumo Logic may need additional information, or may need to work with your organization to change the account type to enable some apps. +## Guides +
@@ -79,6 +81,12 @@ Once a request has been submitted, a support ticket is automatically opened. A r

A guide to the Sumo Logic Log Analysis QuickStart app.

+
+
+ Thumbnail icon

OpenTelemetry Collector Insights

+

A guide to the Sumo Logic OpenTelemetry Collector Insights app.

+
+
Thumbnail icon

Security Analytics

diff --git a/docs/integrations/sumo-apps/opentelemetry-collector-insights.md b/docs/integrations/sumo-apps/opentelemetry-collector-insights.md new file mode 100644 index 0000000000..2810239e12 --- /dev/null +++ b/docs/integrations/sumo-apps/opentelemetry-collector-insights.md @@ -0,0 +1,422 @@ +--- +id: opentelemetry-collector-insights +title: OpenTelemetry Collector Insights +sidebar_label: OpenTelemetry Collector Insights +description: Learn about the Sumo Logic OpenTelemetry Collector Insights app. +--- + +import useBaseUrl from '@docusaurus/useBaseUrl'; +import Tabs from '@theme/Tabs'; +import TabItem from '@theme/TabItem'; + +Thumbnail icon + +The Sumo Logic OpenTelemetry Collector Insights app provides comprehensive monitoring and observability for your OpenTelemetry Collector instances. Monitor collector performance, telemetry data flow, and resource utilization, and troubleshoot data collection issues with preconfigured dashboards and alerts. Track metrics and logs to ensure your telemetry pipeline is running smoothly and efficiently. + +This app supports OpenTelemetry Collector version `0.130.1-sumo-0` and later. + +We use the OpenTelemetry Collector's built-in internal telemetry capabilities to collect metrics and logs about the collector itself. By default, the collector exposes its own telemetry through internal metrics (via a Prometheus interface on port 8888) and logs (emitted to stderr). The collector can also be configured to export its own telemetry data (metrics and logs) to Sumo Logic through OTLP/HTTP endpoints. + +## Fields creation in Sumo Logic for OpenTelemetry Collector Insights + +The following [fields](/docs/manage/fields/) are created as part of the OpenTelemetry Collector Insights app installation, if not already present. + +- `sumo.datasource`. Has a fixed value of `otel_collector`. +- `_contentType`. Has a fixed value of `OpenTelemetry`. +- `deployment.environment`. User configured. 
Enter a name to identify your deployment environment. + +## Prerequisites + +### For OTLP endpoint configuration + +Before configuring the OTEL Collector integration, ensure you have the following prerequisites in place: + +1. **Sumo Logic OTLP Source**: You need to create an OTLP source in your Sumo Logic hosted collector. The OTLP source will provide the endpoint URL that the OTEL Collector will use to send telemetry data. + + **Documentation**: [Creating a Sumo Logic OTLP Source](https://help.sumologic.com/docs/send-data/hosted-collectors/http-source/otlp/) + +### For metrics collection + +The OpenTelemetry Collector must be configured to export its own metrics using the built-in telemetry capabilities. This requires: +- OpenTelemetry Collector version 0.130.1-sumo-0 or later +- Collector configured with telemetry metrics enabled at `detailed` level +- Access to OTLP endpoint for metrics export +- Internal metrics exposed on port 8888 (default) + +### For logs collection + +The OpenTelemetry Collector must be configured to export its own logs using the built-in telemetry capabilities. 
This requires: +- Collector configured with telemetry logs enabled at `debug` level (automatically configured in the provided template) +- JSON encoding for structured log output (automatically configured in the provided template) +- Access to OTLP endpoint for logs export + +### System Requirements + +- OTEL Collector v0.130.1-sumo-0 or later +- Sufficient system resources (CPU, memory) for data processing +- Proper permissions for the collector service to access configured resources + + +## Collection configuration and app installation + +import ConfigAppInstall from '../../reuse/apps/opentelemetry/config-app-install.md'; + + + +### Step 1: Set up collector + +import SetupColl from '../../reuse/apps/opentelemetry/set-up-collector.md'; + + + +### Step 2: Configure integration + +OpenTelemetry works with a [configuration](https://opentelemetry.io/docs/collector/configuration/) YAML file with all the details concerning the data that needs to be collected. + +In this step, you will configure the OpenTelemetry Collector's built-in telemetry to monitor itself. + +Below are the inputs required: + +- **OTLP Endpoint**: Your Sumo Logic OTLP endpoint URL. + + +```yaml +service: + telemetry: + logs: + level: debug + development: false + encoding: json + processors: + - batch: + exporter: + otlp: + protocol: http/protobuf + endpoint: ${OTLP_ENDPOINT}/v1/logs + metrics: + level: detailed + readers: + - periodic: + exporter: + otlp: + protocol: http/protobuf + endpoint: ${OTLP_ENDPOINT}/v1/metrics + resource: + _contentType: OpenTelemetry + sumo.datasource: otel_collector + deployment.environment: ${DEPLOYMENT_ENVIRONMENT} +``` + +You can add any custom fields which you want to tag along with the data ingested in Sumo. + +import EnvVar from '../../reuse/apps/opentelemetry/env-var-required.md'; + + + +YAML + +### Step 3: Send logs and metrics to Sumo Logic + +import LogsIntro from '../../reuse/apps/opentelemetry/send-logs-intro.md'; + + + + + + + +1. 
Add the telemetry configuration to your existing collector configuration file in `/etc/otelcol-sumo/conf.d/` or directly in the main configuration file. +2. Place the env file in the following directory: + ```sh + /etc/otelcol-sumo/env/ + ``` +3. Restart the collector using: + ```sh + sudo systemctl restart otelcol-sumo + ``` + + + + +1. Add the telemetry configuration to your existing collector configuration file in `C:\ProgramData\Sumo Logic\OpenTelemetry Collector\config\conf.d` or the main configuration file. +2. Restart the collector using: + ```powershell + Restart-Service -Name OtelcolSumo + ``` + + + + +1. Add the telemetry configuration to your existing collector configuration file in `/etc/otelcol-sumo/conf.d/` or the main configuration file. +2. Restart the otelcol-sumo process using: + ```sh + otelcol-sumo --config /etc/otelcol-sumo/sumologic.yaml --config "glob:/etc/otelcol-sumo/conf.d/*.yaml" + ``` + + + + +import ChefEnv from '../../reuse/apps/opentelemetry/chef-with-env.md'; + + + + + + + +import AnsEnv from '../../reuse/apps/opentelemetry/ansible-with-env.md'; + + + + + + + +import PuppetEnv from '../../reuse/apps/opentelemetry/puppet-with-env.md'; + + + + + + +import LogsOutro from '../../reuse/apps/opentelemetry/send-logs-outro.md'; + + + +### Validation + +After installation, verify that: +1. The OTEL Collector service is running. +2. The configured base endpoint is reachable. +3. Data is being successfully sent to both the logs (`/v1/logs`) and metrics (`/v1/metrics`) endpoints. +4. Resource attributes are properly applied to the telemetry data. +5. Internal metrics are accessible at `http://localhost:8888/metrics`. 
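The last validation check (internal metrics reachable on port 8888) can be scripted. The following is a minimal sketch, not part of the collector or the Sumo Logic tooling: the helper name `count_otelcol_metrics` is hypothetical, and it assumes `awk` and `curl` are available on the host and that internal metric names start with `otelcol_`, as in the samples in this guide.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not shipped with otelcol-sumo): reads Prometheus text
# from stdin and prints how many distinct otelcol_* metric names it contains.
count_otelcol_metrics() {
  awk '
    /^otelcol_/ {
      name = $0
      sub(/[{ ].*/, "", name)   # strip labels and the sample value
      seen[name] = 1
    }
    END {
      n = 0
      for (k in seen) n++
      print n
    }
  '
}

# Usage, assuming the default internal metrics endpoint described in this guide:
#   curl -fsS http://localhost:8888/metrics | count_otelcol_metrics
```

A result of `0` suggests the endpoint responded without any collector metrics, while a `curl` failure suggests the endpoint is not reachable at all.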
+ +## Sample log messages + +```json +{ + "timestamp": "2024-01-15T10:30:45.123Z", + "level": "info", + "msg": "Batch processor started", + "component": "batch", + "pipeline": "metrics" +} +``` + +```json +{ + "timestamp": "2024-01-15T10:30:46.456Z", + "level": "warn", + "msg": "Dropping data because sending_queue is full", + "component": "sumologicexporter", + "pipeline": "logs" +} +``` + +## Sample metrics + +```json +{ + "metric": "otelcol_processor_batch_batch_send_size", + "sumo.datasource": "otel_collector", + "_contentType": "OpenTelemetry", + "deployment.environment": "production", + "processor": "batch", + "value": 100, + "timestamp": "2024-01-15T10:30:45.123Z" +} +``` + +```json +{ + "metric": "otelcol_exporter_queue_size", + "sumo.datasource": "otel_collector", + "_contentType": "OpenTelemetry", + "deployment.environment": "production", + "exporter": "sumologic", + "value": 150, + "timestamp": "2024-01-15T10:30:45.123Z" +} +``` + +## Key Internal Metrics + +The OpenTelemetry Collector emits comprehensive internal metrics categorized by verbosity levels. For a complete list of internal metrics and their descriptions, see the [OpenTelemetry Collector Internal Telemetry documentation](https://opentelemetry.io/docs/collector/internal-telemetry/#lists-of-internal-metrics). + +## Sample queries + +This sample query is from the **Pipeline Health Overview** panel. + +```sql +sumo.datasource=otel_collector +| json auto maxdepth 1 nodrop +| if (isEmpty(log), _raw, log) as _raw +| parse "* * * *" as timestamp, level, component, msg +| where level in ("error", "warn", "info", "debug") +| count by level, component +| transpose row component column level +``` + +This sample metrics query is from the **Collector Resource Usage** panel. 
+ +```sql title="Sample metrics query" +sumo.datasource=otel_collector metric=otelcol_process_memory_rss deployment.environment=* | avg by deployment.environment +``` + +This sample query monitors queue health from the **Exporter Queue Health** panel. + +```sql +sumo.datasource=otel_collector metric=otelcol_exporter_queue_size deployment.environment=* +| avg by exporter, deployment.environment +``` + +## Viewing OpenTelemetry Collector Insights dashboards + +All dashboards have a set of filters that you can apply to the entire dashboard. Use these filters to drill down and examine the data at a granular level. +- You can change the time range for a dashboard or panel by selecting a predefined interval from a drop-down list, choosing a recently used time range, or specifying custom dates and times. [Learn more](/docs/dashboards/set-custom-time-ranges/). +- You can use template variables to drill down and examine the data on a granular level. For more information, see [Filtering Dashboards with Template Variables](/docs/dashboards/filter-template-variables/). +- **Log-based dashboards** use the `_sourceHost` filter to identify specific collector instances. +- **Metrics-based dashboards** use the `service.instance.id` filter to identify specific collector instances. + +### Overview + +The **OpenTelemetry Collector Insights - Overview** dashboard provides a high-level view of your OpenTelemetry Collector fleet's health and performance. This is your starting point for monitoring collector instances. + +Use this dashboard to: +- Monitor the overall health of your collector fleet. +- Identify performance bottlenecks and resource constraints. +- Track data flow and processing rates across collectors. +- Quickly spot collectors experiencing issues. + +Overview + +### Logs + +The **OpenTelemetry Collector Insights - Logs** dashboard provides detailed insights into collector log output for root-cause analysis of errors, data-dropping events, and restarts. 
+ +Use this dashboard to: +- Analyze error patterns and troubleshoot issues. +- Monitor collector startup and shutdown events. +- Identify data loss or processing problems. +- Track log severity trends across your collector fleet. + +Logs + +### Pipeline: Receiver Health + +The **OpenTelemetry Collector Insights - Pipeline: Receiver Health** dashboard focuses exclusively on the data ingestion stage of the pipeline to monitor data sources and receiver performance. + +Use this dashboard to: +- Monitor receiver performance and data ingestion rates. +- Identify issues with data sources and input connections. +- Track receiver-specific errors and failures. +- Analyze accepted vs refused data points. + +Pipeline Receiver Health + +### Pipeline: Processor Health + +The **OpenTelemetry Collector Insights - Pipeline: Processor Health** dashboard is crucial for understanding if any processors (like batch, memory_limiter, or resourcedetection) are dropping data or causing performance issues. + +Use this dashboard to: +- Monitor processor performance and throughput. +- Identify data drops or processing bottlenecks. +- Track processor-specific configurations and health. +- Analyze batch processing efficiency and triggers. + +Pipeline Processor Health + +### Pipeline: Exporter Health + +The **OpenTelemetry Collector Insights - Pipeline: Exporter Health** dashboard is the most critical dashboard for diagnosing backpressure and data loss at the egress stage of the pipeline. + +Use this dashboard to: +- Monitor exporter performance and success rates. +- Identify backpressure issues and export failures. +- Track data delivery to downstream systems. +- Analyze queue utilization and capacity. + +Pipeline Exporter Health + +### Resource Utilization + +The **OpenTelemetry Collector Insights - Resource Utilization** dashboard provides a deep dive into the collector's own resource consumption to diagnose performance issues and plan for capacity. 
+ +Use this dashboard to: +- Monitor CPU, memory, and disk usage by collectors. +- Plan capacity and resource allocation. +- Identify resource constraints and optimization opportunities. +- Track heap allocation and garbage collection patterns. + +Resource Utilization + +## Troubleshooting + +### Common issues + +##### Collector connection failure + +If your collector fails to connect to Sumo Logic, you may need to configure proxy settings. Check the collector's logs for connection errors: + +```bash +# On systemd systems +journalctl --unit otelcol-sumo + +# Look for errors like "Unable to get a heartbeat" +``` + +##### High queue utilization + +Monitor the `otelcol_exporter_queue_size` and `otelcol_exporter_queue_capacity` metrics. If the queue is consistently full, you may need to: +- Reduce data ingestion rate +- Increase queue capacity +- Scale horizontally with more collectors + +##### Data dropping + +Watch for logs containing "Dropping data because sending_queue is full" and monitor failed enqueue metrics: +- `otelcol_exporter_enqueue_failed_spans` +- `otelcol_exporter_enqueue_failed_metric_points` +- `otelcol_exporter_enqueue_failed_log_records` + +### Accessing collector metrics directly + +By default, the collector's internal metrics are available in Prometheus format at `http://localhost:8888/metrics`. You can access them using: + +```bash +curl http://localhost:8888/metrics +``` + +### Log levels and configuration + +Configure different log levels for troubleshooting: +- **DEBUG**. Most verbose, includes detailed trace information +- **INFO**. Standard operational information (default) +- **WARN**. Warning messages about potential issues +- **ERROR**. 
Error conditions that need attention + +## Create monitors for OpenTelemetry Collector Insights app + +import CreateMonitors from '../../reuse/apps/create-monitors.md'; + + + +### OpenTelemetry Collector Insights Alerts + +| Name | Description | Alert Condition | Recover Condition | +|:--|:--|:--|:--| +| `OpenTelemetry Collector Insights - Collector Instance is Down` | This alert fires when a Collector instance stops sending telemetry for more than 10 minutes, indicating it is down or has a connectivity issue. | Missing Data | Data Found | +| `OpenTelemetry Collector Insights - Exporter Queue Nearing Capacity` | This alert fires when an exporter's sending queue is over 90% full. This is a strong leading indicator of backpressure and imminent data loss. | Count >= 90 | Count < 90 | +| `OpenTelemetry Collector Insights - High Memory Usage (RSS)` | This alert fires when a Collector's memory usage (RSS) exceeds 2 GB. This could be an early indicator of a memory leak or an under-provisioned host. | Count > 2000000000 | Count <= 2000000000 | +| `OpenTelemetry Collector Insights - High Metadata Cardinality` | This alert fires when the batch processor is handling more than 1000 unique combinations of metadata. This is a known cause of performance degradation, high CPU, and high memory usage. | Count > 1000 | Count <= 1000 | \ No newline at end of file diff --git a/sidebars.ts b/sidebars.ts index 1efcce5cff..a15ce29a35 100644 --- a/sidebars.ts +++ b/sidebars.ts @@ -2674,6 +2674,7 @@ integrations: [ 'integrations/sumo-apps/infrequent-data-tier', 'integrations/sumo-apps/kickstart-data', 'integrations/sumo-apps/log-analysis-quickstart', + 'integrations/sumo-apps/opentelemetry-collector-insights', 'integrations/sumo-apps/security-analytics', ], },