Commit 977084e

marciocloudflare authored and Oxyjun committed
[MT] Endpoint health checks (cloudflare#23911)
* added beta badge
* added content
* added curl component
* added api partial info
* updated url
* refined text
* Apply suggestions from code review

Co-authored-by: Jun Lee <[email protected]>

---------

Co-authored-by: Jun Lee <[email protected]>
1 parent 3433627 commit 977084e

1 file changed: +129 -2 lines changed

Lines changed: 129 additions & 2 deletions
Original file line number | Diff line number | Diff line change
@@ -1,11 +1,16 @@
11
---
22
pcx_content_type: how-to
3-
title: Run endpoint health checks
3+
title: Run endpoint health checks (beta)
44
sidebar:
5+
badge:
6+
text: Beta
57
order: 1
8+
label: Run endpoint health checks
69

710
---
811

12+
import { CURL, Render } from '~/components';
13+
914
Magic Transit uses endpoint health checks to determine the overall health of your [inter-network connections](/magic-transit/reference/gre-ipsec-tunnels/). Probes originate from Cloudflare infrastructure, outside customer network namespaces, and target IP addresses deep within your network, beyond the tunnel-terminating border router. These "long distance" probes are purely diagnostic.
1015

1116
When choosing which endpoint IP addresses to monitor with health checks, use these guidelines:
@@ -15,11 +20,133 @@ When choosing which endpoint IP addresses to monitor with health checks, use the
1520

1621
Cloudflare pings health check IPs from within the [published Cloudflare IP range](https://www.cloudflare.com/ips/), which is also available via the [Cloudflare API](/api/resources/ips/methods/list/).
1722

18-
Refer to the table below for an example of an endpoint health check configuration.
23+
When configuring an endpoint health check for an IP prefix, select an IP address that falls within that prefix. Refer to the table below for an example of an endpoint health check configuration.
1924

2025
| Prefix | Endpoint IP address |
2126
| ----------------- | ------------------- |
2227
| `103.21.244.0/24` | `103.21.244.100` |
2328
| `103.21.245.0/24` | `103.21.245.100` |
2429

2530
Refer to [Tunnel health checks](/magic-transit/reference/tunnel-health-checks/) for more information on this topic.
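
The rule that each endpoint IP address must fall within its prefix can be checked before you configure anything. The following is a minimal sketch using Python's standard `ipaddress` module; the helper name `endpoint_in_prefix` is hypothetical and the values are the ones from the example table above.

```python
import ipaddress


def endpoint_in_prefix(endpoint: str, prefix: str) -> bool:
    """Return True if the endpoint IP address falls inside the prefix."""
    return ipaddress.ip_address(endpoint) in ipaddress.ip_network(prefix)


# Prefix -> endpoint IP address pairs taken from the example table.
checks = {
    "103.21.244.0/24": "103.21.244.100",
    "103.21.245.0/24": "103.21.245.100",
}

# Map each prefix to whether its chosen endpoint is inside it.
results = {prefix: endpoint_in_prefix(ip, prefix) for prefix, ip in checks.items()}
```

An endpoint paired with the wrong prefix, such as `103.21.245.100` with `103.21.244.0/24`, makes the helper return `False`.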
31+
32+
## Configure endpoint health checks (beta)
33+
34+
Endpoint health checks can only be configured via the Cloudflare API. You cannot configure or view them in the dashboard. Configuring endpoint health checks is currently a beta feature.
35+
36+
Refer to the [API documentation](/api/resources/diagnostics/subresources/endpoint-healthchecks/) to learn how to create, list, and delete endpoint health checks. The following example request creates a new endpoint health check.
37+
38+
<Render file="account-id-api-key" product="networking-services" />
39+
40+
<CURL
41+
url="https://api.cloudflare.com/client/v4/accounts/<ACCOUNT_ID>/diagnostics/endpoint-healthcheck"
42+
method="POST"
43+
json={{
44+
"check_type": "icmp",
45+
"endpoint": "8.31.160.1"
46+
}}
47+
/>
48+
49+
```json output
50+
{
51+
"result": {
52+
"id": "<HEALTH_CHECK_ID>",
53+
"check_type": "icmp",
54+
"endpoint": "8.31.160.1"
55+
},
56+
"success": true,
57+
"errors": [],
58+
"messages": []
59+
}
60+
```
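
If you prefer to script the request rather than use `curl`, the same call could be built along these lines. This is a sketch only: it assumes you authenticate with an API token sent as a bearer token, it copies the URL path from the example above, and the helper names `build_create_request` and `parse_health_check_id` are hypothetical.

```python
import json
import urllib.request

API_BASE = "https://api.cloudflare.com/client/v4"


def build_create_request(account_id: str, api_token: str, endpoint: str) -> urllib.request.Request:
    """Build (but do not send) the POST request that creates an ICMP health check."""
    # URL path copied from the example request above.
    url = f"{API_BASE}/accounts/{account_id}/diagnostics/endpoint-healthcheck"
    body = json.dumps({"check_type": "icmp", "endpoint": endpoint}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            # Assumption: API token authentication.
            "Authorization": f"Bearer {api_token}",
            "Content-Type": "application/json",
        },
    )


def parse_health_check_id(response_text: str) -> str:
    """Extract the new health check ID from a response shaped like the example output."""
    payload = json.loads(response_text)
    if not payload.get("success"):
        raise RuntimeError(f"API request failed: {payload.get('errors')}")
    return payload["result"]["id"]
```

To send the request, pass the returned object to `urllib.request.urlopen` and give the decoded response body to `parse_health_check_id`.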
61+
62+
## Query endpoint health checks with GraphQL
63+
64+
You can also use GraphQL to query the results of your endpoint health checks. This lets you see whether an IP address within your network is reachable, and set alerts in case there is a problem.
65+
66+
You can query the following dimensions:
67+
68+
- `checkId`: The ID of the check associated with the health check.
69+
- `checkType`: The type of check associated with the health check.
70+
- `date`: The health check event timestamp truncated to the day.
71+
- `datetime`: The health check event timestamp.
72+
- `datetimeFifteenMinutes`: The health check event timestamp truncated to multiples of 15 minutes.
73+
- `datetimeFiveMinutes`: The health check event timestamp truncated to multiples of five minutes.
74+
- `datetimeHalfOfHour`: The health check event timestamp truncated to multiples of 30 minutes.
75+
- `datetimeHour`: The health check event timestamp truncated to the hour.
76+
- `datetimeMinute`: The health check event timestamp truncated to the minute.
77+
- `endpoint`: The endpoint of the check associated with the health check.
78+
79+
The following example query checks the health of your endpoints:
80+
81+
```graphql
82+
query Viewer {
83+
viewer {
84+
accounts(filter: { accountTag: "YOUR_ACCOUNT_TAG" }) {
85+
magicEndpointHealthCheckAdaptiveGroups(
86+
filter: { date_geq: "2025-05-10" }
87+
limit: 10
88+
) {
89+
count
90+
dimensions {
91+
checkId
92+
checkType
93+
date
94+
datetime
95+
datetimeFifteenMinutes
96+
datetimeFiveMinutes
97+
datetimeHalfOfHour
98+
datetimeHour
99+
datetimeMinute
100+
endpoint
101+
}
102+
sum {
103+
failures
104+
total
105+
}
106+
}
107+
}
108+
}
109+
}
110+
```
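
A GraphQL response mirrors the shape of the query, so the result of the query above can be summarized per endpoint. The sketch below assumes that `sum.failures` and `sum.total` count failed and total probes respectively; the helper name `pass_rate_by_endpoint` and the sample response values are made up for illustration.

```python
from collections import defaultdict


def pass_rate_by_endpoint(response: dict) -> dict:
    """Return the percentage of passing health checks for each endpoint."""
    totals = defaultdict(lambda: [0, 0])  # endpoint -> [failures, total]
    for account in response["data"]["viewer"]["accounts"]:
        for group in account["magicEndpointHealthCheckAdaptiveGroups"]:
            endpoint = group["dimensions"]["endpoint"]
            totals[endpoint][0] += group["sum"]["failures"]
            totals[endpoint][1] += group["sum"]["total"]
    # Skip endpoints with no probes to avoid dividing by zero.
    return {
        endpoint: 100.0 * (total - failures) / total
        for endpoint, (failures, total) in totals.items()
        if total > 0
    }


# Illustrative response with invented numbers, trimmed to the fields used here.
sample_response = {
    "data": {
        "viewer": {
            "accounts": [
                {
                    "magicEndpointHealthCheckAdaptiveGroups": [
                        {"dimensions": {"endpoint": "8.31.160.1"}, "sum": {"failures": 1, "total": 50}},
                        {"dimensions": {"endpoint": "8.31.160.1"}, "sum": {"failures": 1, "total": 50}},
                    ]
                }
            ]
        }
    }
}

rates = pass_rate_by_endpoint(sample_response)
```

With the sample data, the two groups for `8.31.160.1` add up to 2 failures out of 100 probes, which gives a pass rate of 98%.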
111+
112+
## Configure alerts for endpoint health checks
113+
114+
You can set up alerts to notify you when the percentage of passing endpoint health checks falls below a threshold you define.
115+
116+
1. Send a `GET` request to list the IDs of all configured endpoint health checks:
117+
118+
<CURL
119+
url="https://api.cloudflare.com/client/v4/accounts/<ACCOUNT_ID>/diagnostics/endpoint-healthcheck"
120+
method="GET"
121+
/>
122+
123+
```json output
124+
{
125+
"result": [
126+
{
127+
"id": "<HEALTH_CHECK_ID>",
128+
"check_type": "icmp",
129+
"endpoint": "8.31.160.1"
130+
}
131+
],
132+
"success": true,
133+
"errors": [],
134+
"messages": []
135+
}
136+
```
137+
138+
2. Take note of the `id` value for the endpoint you want to get alerts for.
139+
3. Log in to the [Cloudflare dashboard](https://dash.cloudflare.com/), and select your account.
140+
4. Go to **Notifications** > **Add**.
141+
5. From the dropdown menu, select *Magic Transit*.
142+
6. Select **Magic Endpoint Health Check Alert**.
143+
7. Provide a name for your new notification and optionally provide a description.
144+
8. In the *Service Level Objective (SLO)* dropdown, select the SLO threshold for your notification. The SLO defines the percentage of endpoint health checks that should be passing. If the percentage of passing endpoint health checks falls below the SLO, an alert is generated:
145+
- **High** - 99% of endpoint health checks
146+
- **Medium** - 98% of endpoint health checks
147+
- **Low** - 97% of endpoint health checks
148+
9. In the dropdown menu below the SLO, select the `id` value you got through the API in step 1. This `id` identifies the endpoint health check you want to get notifications for.
149+
10. Select your preferred notification method (such as email or webhooks).
150+
11. Select **Save**.
151+
152+
You will now receive notifications via your preferred method whenever the percentage of passing endpoint health checks falls below your chosen SLO.
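
To make the SLO comparison concrete, the sketch below applies the three thresholds listed in step 8 to a pass count. It is only an illustration of the arithmetic, not how Cloudflare evaluates alerts; the function name `breaches_slo` and the evaluation over a single pass/total pair are assumptions.

```python
# SLO levels and percentages as listed in the notification settings above.
SLO_THRESHOLDS = {"High": 99.0, "Medium": 98.0, "Low": 97.0}


def breaches_slo(passing: int, total: int, level: str) -> bool:
    """Return True if the percentage of passing checks is below the chosen SLO."""
    if total <= 0:
        # No probes recorded, so there is nothing to compare against the SLO.
        return False
    return 100.0 * passing / total < SLO_THRESHOLDS[level]


# 985 of 1000 checks passing is 98.5%: below High (99%), above Medium (98%).
high_alert = breaches_slo(985, 1000, "High")
medium_alert = breaches_slo(985, 1000, "Medium")
```

In this example, a 98.5% pass rate would breach the **High** SLO but not the **Medium** or **Low** ones.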
