Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
11 changes: 11 additions & 0 deletions blog-service/2025-11-14-manage.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,11 @@
---
title: System Limits Visibility (Manage)
image: https://assets-www.sumologic.com/company-logos/_800x418_crop_center-center_82_none/SumoLogic_Preview_600x600.jpg?mtime=1617040082
keywords:
- system-limits-visibility
- manage
- health-events
hide_table_of_contents: true
---

We’re excited to announce that Health Events are now automatically generated when 90% credit usage threshold is exceeded for Lookup Tables, Partitions, Fields, or Field Extraction Rules (FERs). These health events can further be configured to receive timely alerts whenever a threshold breach occurs, ensuring that all designated recipients are promptly notified when the health event is triggered every time. [Learn more](/docs/manage/health-events).
198 changes: 106 additions & 92 deletions docs/manage/health-events.md
Original file line number Diff line number Diff line change
@@ -1,89 +1,35 @@
---
id: health-events
title: Health Events
description: Monitor the health of your Collectors and Sources.
description: Monitor the health of your Collectors, Sources, and Log data.
---

## Availability

| Account Type | Account Level |
|:--------------|:---------------------------------------------------------------------------------|
| CloudFlex | Professional, Enterprise |
| Credits | Trial, Essentials, Enterprise Operations, Enterprise Security, Enterprise Suite |

Health events allow you to keep track of the health of your Collectors, Sources, and Ingest Budgets. You can use them to find and investigate common errors and warnings that are known to cause collection issues. 

This framework includes the following:

* Health event logs indexed in the [System Event Index](/docs/manage/security/audit-indexes/system-event-index).
* A [health events table](#health-events-table) on the Alerts page.
* A health status column on the [Collection page](#collection-page).

Health events are sent from Installed Collectors on version 19.308-2 and
later.

## Alerts

Alerts for specific health events are easy to create in the Health Events Table. The details pane of an event provides a **Create Scheduled Search** button to automatically generate the required query.

## Health events

Health events are created when an issue is detected with a Collector or Source. Events are indexed and searchable in a separate partition named **sumologic_system_events** in the [System Event Index](/docs/manage/security/audit-indexes/system-event-index). For details on what information is available in a health event, see the [common parameters](#common-parameters) table.

### Health events table
import useBaseUrl from '@docusaurus/useBaseUrl';

The health events table allows you to easily view and investigate problems getting your data to Sumo.
System Health Events are generated automatically when the system detects an issue within a Collector or Source, or when a credit usage threshold is exceeded for Lookup Tables, Partitions, Fields, or Field Extraction Rules (FERs).

On the health events table, you can search, filter, and sort incidents by key aspects like severity, resource name, event name, resource type, and opened since date.
These events provide visibility into the operational health of Collectors, Sources, and Ingest Budgets, enabling administrators to monitor performance and identify potential issues proactively. Health events also help in investigating common errors and warnings known to affect data collection and processing.

[**New UI**](/docs/get-started/sumo-logic-ui/). To access the health events table, in the main Sumo Logic menu select **Data Management**, and then under **Data Collection** select **Health Events**. You can also click the **Go To...** menu at the top of the screen and select **Health Events**.
Additionally, a health event is triggered when any credit limit associated with Lookup Tables, Partitions, Fields, or FERs reaches or exceeds 90% of the allocated capacity, allowing timely action to prevent service disruption. This health event will auto-resolve when the usage falls back below the 90% threshold limit.

[**Classic UI**](/docs/get-started/sumo-logic-ui-classic). To access the health events table, in the main Sumo Logic menu select **Manage Data > Monitoring > Health Events**.

![health events table.png](/img/health-events/health-events-table.png)

Click on a row to view the details of a health event.

![health event detail.png](/img/health-events/health-event-detail.png)

Click the **Create Scheduled Search** button on the details pane to get alerts for specific health events. The unique identifier of the resource, such as the Source or Collector, is used in the query. See [Schedule a Search](../alerts/scheduled-searches/schedule-search.md) for details.

Under the **More Actions** menu you can select:

* **Event History** to run a search against the **sumologic_system_events** partition to view all of the related event logs.
* **View Object** to view the Collector or Source in the Collection page related to the event.

### Health events severity
:::note
Health events are sent from Installed Collectors of version `19.308-2` and later.
:::

Events are categorized by two severity levels, warning and error. The severity column has color-coded error and warning events so you can quickly determine the severity of a given issue.
## Availability

* ![warning label.png](/img/health-events/warning-label.png) A warning indicates the Collector or Source has a configuration issue or is operating in a degraded state.
* ![Error label.png](/img/health-events/Error-label.png) An error indicates the Collector or Source is unable to collect data as expected.
| Account Type | Account Level |
|:--------------|:---------------------------------------------------------------------------------|
| CloudFlex | Professional, Enterprise |
| Credits | Trial, Essentials, Enterprise Operations, Enterprise Security, Enterprise Suite |

### Common parameters
## Event schema

Each health event log has common keys that categorize it to a product
area and provide details of the event. The following table shows the
common parameters in the order that they are found in health event logs.
This section defines the structure of System Health Events, including all key parameters and their descriptions. The example below illustrates a sample health event in JSON format, followed by a parameter table explaining each field for better understanding and analysis.

| Parameter | Description | Data Type |
|:--|:--|:--|
| status | Either `Healthy` or `Unhealthy` based on the event. | String |
| details | The details of the event include the type as `trackerId`, the `name` of the event, and a `description`. | JSON object of Strings |
| eventType | Health events have a value of `Health-Change`. | String |
| severityLevel | Either `Error` or `Warning` based on the event. | String |
| accountId | The unique identifier of the organization. | String |
| eventId | The unique identifier of the event. | String |
| eventName | The name of the event. | String |
| eventTime | The event timestamp in [ISO 8601](https://en.wikipedia.org/wiki/ISO_8601) format. | String |
| eventFormatVersion | The event log format version. | String |
| operator | Information on who did the operation. If it's missing, the Sumo service was the operator. | JSON object of Strings |
| subsystem | The product area of the event. | String |
| resourceIdentity | This includes any unique identifiers, names, and the type of the object associated with the event. | JSON object of Strings |
### JSON example

### Health event log example

```json
```json title="Sample Health Event"
{
"status": "UnHealthy",
"details": {
Expand All @@ -109,10 +55,94 @@ common parameters in the order that they are found in health event logs.
}
```

### Parameters table

Each health event log has common keys that categorize it to a product area and provide details of the event. The following table shows the common parameters in the order that they are found in health event logs.

| Parameter | Description | Data type |
|:--|:--|:--|
| status | Either `Healthy` or `Unhealthy` based on the event. | String |
| details | The details of the event include the type as `trackerId`, the `name` of the event, and a `description`. | JSON object of Strings |
| eventType | Health events have a value of `Health-Change`. | String |
| severityLevel | Either `Error` or `Warning` based on the event. | String |
| accountId | The unique identifier of the organization. | String |
| eventId | The unique identifier of the event. | String |
| eventName | The name of the event. | String |
| eventTime | The event timestamp in [ISO 8601](https://en.wikipedia.org/wiki/ISO_8601) format. | String |
| eventFormatVersion | The event log format version. | String |
| operator | Information on who did the operation. If it's missing, the Sumo service was the operator. | JSON object of Strings |
| subsystem | The product area of the event. | String |
| resourceIdentity | This includes any unique identifiers, names, and the type of the object associated with the event. | JSON object of Strings |

## Configure Scheduled Search

Configuring the scheduled search for the selected health event will help you with timely alerts to all the recipients when the health event is triggered every time. To configure, follow the below steps:

1. [**Classic UI**](/docs/get-started/sumo-logic-ui-classic). Go to **Manage Data > Monitoring > Health Events**.<br/>[**New UI**](/docs/get-started/sumo-logic-ui). In the main Sumo Logic menu select **Data Management**, and then under **Data Collection** select **Health Events**. <br/><img src={useBaseUrl('/img/health-events/health-events-table.png')} alt="health-events-table" style={{border: '1px solid gray'}} width="800"/>
1. Click on the required row to view the details of a health event. <br/><img src={useBaseUrl('/img/health-events/health-event-detail.png')} alt="health-events-detial" style={{border: '1px solid gray'}} width="400"/>
1. Click the **Create Scheduled Search** button and configure it based on your requirement. For more details, refer to [Create a Scheduled Search](/docs/alerts/scheduled-searches/schedule-search/).
:::info
Query will be auto-generated for the selected health event.
:::

Use the below scheduled search query to get an alert when 90% threshold is exceeded for Lookup Tables, Partitions, Fields, or Field Extraction Rules (FERs).

``` sql
_index=sumologic_system_events "0000000007063B25"
| json "eventType", "resourceIdentity.id" as eventType , resourceId
| where eventType = "Health-Change" AND resourceId = "0000000007063B25"
```

For specific `eventType`, `resourceId`, `eventName`:

```sql
_index=sumologic_system_events "0000000007063B25"
| json "eventType", "resourceIdentity.id","eventName" as eventType, resourceId, eventName
| where eventType = "Health-Change" AND resourceId = "0000000007063B25" AND eventName='LookupsLimitApproaching'
```

## View Health Events

The health events table allows you to easily view and investigate problems which occurs while injecting the data to Sumo Logic. On the health events table, you can search, filter, and sort incidents by key aspects like severity, resource name, event name, resource type, and opened since date.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
The health events table allows you to easily view and investigate problems which occurs while injecting the data to Sumo Logic. On the health events table, you can search, filter, and sort incidents by key aspects like severity, resource name, event name, resource type, and opened since date.
The health events table allows you to easily view and investigate problems which occur while injecting the data to Sumo Logic. On the health events table, you can search, filter, and sort incidents by key aspects like severity, resource name, event name, resource type, and opened since date.


:::info
It may take up to 15 minutes for a 90% usage breach for Lookup Tables, Partitions, Fields, or Field Extraction Rules (FERs) to reflect on the Health Events page after detection.
:::

1. [**Classic UI**](/docs/get-started/sumo-logic-ui-classic). Go to **Manage Data > Monitoring > Health Events**.<br/>[**New UI**](/docs/get-started/sumo-logic-ui). In the main Sumo Logic menu select **Data Management**, and then under **Data Collection** select **Health Events**. <br/><img src={useBaseUrl('/img/health-events/health-events-table.png')} alt="health-events-table" style={{border: '1px solid gray'}} width="800"/>
1. Click on the required row to view the details of a health event. <br/><img src={useBaseUrl('/img/health-events/health-event-detail.png')} alt="health-events-detial" style={{border: '1px solid gray'}} width="400"/>
- **Create Scheduled Search**. Click this button to get alerts for specific health events. The unique identifier of the resource type is used in the query. See [Schedule a Search](../alerts/scheduled-searches/schedule-search.md) for details.
- Under the **More Actions** menu you can select:
* **Event History** to run a search against the **sumologic_system_events** partition to view all of the related event logs.
* **View Object** to view the resource in detail related to the event.
- **Description**. Provides the information about the health events error or warning.
- **Severity**. Events are categorized by two severity levels, warning, and error. The severity column has color-coded error and warning events so you can quickly determine the severity of a given issue.
* ![warning label.png](/img/health-events/warning-label.png) A warning indicates the Collector or Source has a configuration issue or is operating in a degraded state.
* ![Error label.png](/img/health-events/Error-label.png) An error indicates the Collector or Source is unable to collect data as expected.
- **Event Name**. The name or type of the health event that occurred. This identifies what kind of issue or status change was detected.
- **Resource Type**. The category or class of resource affected by the event. For example, Collectors, Sources, or Organizations.
- **Resource ID**. A unique identifier for the affected resource.
- **Created At**. The timestamp indicating when the event was generated by the monitoring system.
- **Collector ID**. The unique identifier of the collector that detected and reported the event. This field is only available for *Source* resource type.
- **Collector Name**. The name of the collector associated with the event. This field is only available for *Source* resource type.
- **Error**. A brief summary or title of the detected issue.
- **Service**. Displays the specific resource or service affected by the event.
- **Error Code**. A numeric code associated with the error, that provides a quick reference for troubleshooting or mapping to known issue types.
- **Error Info**. Detailed information about the event. This may include error context and suggested corrective actions.
- **Minutes Since Last Heartbeat**. The number of minutes that have elapsed since the system last received a heartbeat signal from the resource. A higher number may indicate the resource is offline or unresponsive. This field is only available for *Collector* resource type.

## View Health Events in Collection page

A **Health** column on the Collection page shows color-coded healthy, error, and warning states for Collectors and Sources to quickly determine the health of your Collectors and Sources.<br/><img src={useBaseUrl('/img/health-events/Collection-health-column.png')} alt="Collection-health-column" style={{border: '1px solid gray'}} width="800"/>

To view the number of health events associated with the Collector or Source, perform the following steps:

1. Hover over a **Health** status to view a tooltip that provides the number of health events detected on the selected Collector or Source. <br/><img src={useBaseUrl('/img/health-events/health_tooltip.png')} alt="health_tooltip" style={{border: '1px solid gray'}} width="200"/>
1. Click on the **Health** status of a Collector or Source to view a pop-up displaying a list of related events. <br/><img src={useBaseUrl('/img/health-events/object_event_details.png')} alt="object_event_details" style={{border: '1px solid gray'}} width="500"/>

## Search health events

To search all health events run a query against the internal partition
named **sumologic_system_events**. For example,
Events are indexed and searchable in a separate partition named **sumologic_system_events** in the [System Event Index](/docs/manage/security/audit-indexes/system-event-index). To search all health events run a query against the internal partition named **sumologic_system_events**. For example,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Events are indexed and searchable in a separate partition named **sumologic_system_events** in the [System Event Index](/docs/manage/security/audit-indexes/system-event-index). To search all health events run a query against the internal partition named **sumologic_system_events**. For example,
Events are indexed and searchable in a separate partition named `sumologic_system_events` in the [System Event Index](/docs/manage/security/audit-indexes/system-event-index). To search all health events run a query against the internal partition named `sumologic_system_events`. For example,


```sql
_index=sumologic_system_events "Health-Change"
Expand All @@ -128,22 +158,6 @@ Creating a query that defines built-in metadata field values in the scope can he

| **Metadata Field** | **Assignment Description** |
|:--|:--|
| _sourceCategory | Value of the [common parameter](#common-parameters), `subsystem`. |
| _sourceName | Value of the [common parameter](#common-parameters), `eventName`. |
| _sourceHost | The remote IP address of the host that made the request. If not available the value will be `no_sourceHost`. |

### Collection page

A **Health** column on the Collection page shows color-coded healthy, error, and warning states for Collectors and Sources so you can quickly determine the health of your Collectors and Sources.

The **status** column now shows the status of Sources manually paused by users.

![Collection health column.png](/img/health-events/Collection-health-column.png)

* Hover your mouse over a Collector or Source to view a tooltip that provides the number of health events detected on the Collector or Source.

![health tooltip.png](/img/health-events/health_tooltip.png)

* Click on the **Health** status in a row to view a pop-up displaying a list of related events.

![object event details.png](/img/health-events/object_event_details.png)
| _sourceCategory | Value of the [common parameter](#parameters-table), `subsystem`. |
| _sourceName | Value of the [common parameter](#parameters-table), `eventName`. |
| _sourceHost | The remote IP address of the host that made the request. If not available the value will be `no_sourceHost`. |
2 changes: 1 addition & 1 deletion docs/metrics/introduction/metric-formats.md
Original file line number Diff line number Diff line change
Expand Up @@ -97,7 +97,7 @@ cluster=cluster-1 node=node-1 cpu=cpu-1 metric=cpu_idle 97.29 1460061337

### Mandatory metric name

Unlike Prometheus, Carbon 2.0 format doesn't enforce the presence of a metric name. It also cannot be reliably inferred automatically. Therefore, Sumo Logic requires a `metric` key to be present among `intrinsic_tags`. All metrics without a `metric` key specified will not be ingested to Sumo Logic and a `MetricsMetricNameMissing` Health Event for the associated Metric Source will be triggered (for more information on Health Events, see [About Health Events](/docs/manage/health-events#health-events)).
Unlike Prometheus, Carbon 2.0 format doesn't enforce the presence of a metric name. It also cannot be reliably inferred automatically. Therefore, Sumo Logic requires a `metric` key to be present among `intrinsic_tags`. All metrics without a `metric` key specified will not be ingested to Sumo Logic and a `MetricsMetricNameMissing` Health Event for the associated Metric Source will be triggered (for more information on Health Events, see [About Health Events](/docs/manage/health-events)).
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Unlike Prometheus, Carbon 2.0 format doesn't enforce the presence of a metric name. It also cannot be reliably inferred automatically. Therefore, Sumo Logic requires a `metric` key to be present among `intrinsic_tags`. All metrics without a `metric` key specified will not be ingested to Sumo Logic and a `MetricsMetricNameMissing` Health Event for the associated Metric Source will be triggered (for more information on Health Events, see [About Health Events](/docs/manage/health-events)).
Unlike Prometheus, Carbon 2.0 format doesn't enforce the presence of a metric name. It also cannot be reliably inferred automatically. Therefore, Sumo Logic requires a `metric` key to be present among `intrinsic_tags`. All metrics without a `metric` key specified will not be ingested to Sumo Logic and a `MetricsMetricNameMissing` Health Event for the associated Metric Source will be triggered (for more information on Health Events, see [Health Events](/docs/manage/health-events)).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Unlike Prometheus, Carbon 2.0 format doesn't enforce the presence of a metric name. It also cannot be reliably inferred automatically. Therefore, Sumo Logic requires a `metric` key to be present among `intrinsic_tags`. All metrics without a `metric` key specified will not be ingested to Sumo Logic and a `MetricsMetricNameMissing` Health Event for the associated Metric Source will be triggered (for more information on Health Events, see [About Health Events](/docs/manage/health-events)).
Unlike Prometheus, Carbon 2.0 format doesn't enforce the presence of a metric name. It also cannot be reliably inferred automatically. Therefore, Sumo Logic requires a `metric` key to be present among `intrinsic_tags`. All metrics without a `metric` key specified will not be ingested to Sumo Logic and a `MetricsMetricNameMissing` Health Event for the associated Metric Source will be triggered (for more information on Health Events, see [Health Events](/docs/manage/health-events)).


For example, the following metric will be correctly ingested to Sumo Logic:
```
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -19,7 +19,7 @@ If the Source has any issues during any one of these states, it is placed in an

When you delete the Source, it is placed in a **Stopping** state. When it has successfully stopped, it is deleted from your Hosted Collector.

On the [Collection page](/docs/manage/health-events#collection-page), the Health and Status for Sources is displayed. Use [Health Events](/docs/manage/health-events.md) to investigate issues with collection.
On the [Collection page](/docs/manage/health-events#view-health-events-in-collection-page), the Health and Status for Sources is displayed. Use [Health Events](/docs/manage/health-events) to investigate issues with collection.

Hover your mouse over the status icon to view a tooltip with a count of the detected errors and warnings. You can click on the status icon to open a Health Events panel with details on each detected issue.

Expand Down
Loading