pages/agent/self_hosted/monitoring_and_observability.md (+106: 106 additions & 0 deletions)
@@ -124,6 +124,112 @@ Once enabled, the agent will generate the following metrics (duration measured i
 - `buildkite.jobs.duration.success.median`
 - `buildkite.jobs.duration.success.95percentile`
 
+## Buildkite agent metrics CLI
+
+The [buildkite-agent-metrics](https://github.com/buildkite/buildkite-agent-metrics) tool is a standalone command-line binary that collects agent and job metrics from the [`metrics` endpoint of the Buildkite agent API](/docs/apis/agent-api/metrics) and publishes these metrics to a monitoring and observability backend of your choice. This tool is particularly useful for enabling autoscaling based on queue depth and agent availability.
+Download the latest binary from [GitHub Releases](https://github.com/buildkite/buildkite-agent-metrics/releases), or run it as a Docker container:
+
+```shell
+docker run --rm public.ecr.aws/buildkite/agent-metrics:latest \
+  -token "$BUILDKITE_AGENT_TOKEN" \
+  -interval 30s \
+  -queue my-queue
+```
+
+You can also install from source using Go:
+
+```shell
+go install github.com/buildkite/buildkite-agent-metrics/v5@latest
+```
+
+### Running
+
+The tool requires an [agent token](/docs/agent/self-hosted/tokens), which could be the same one used when [assigning the self-hosted agent to a queue](/docs/agent/queues#assigning-a-self-hosted-agent-to-a-queue), or another agent token configured within the same [cluster](/docs/pipelines/security/clusters). The simplest deployment runs it as a long-running daemon that collects metrics across all queues in an organization:
+For more details on configuration options, AWS Lambda deployment, and backend-specific settings, see the [buildkite-agent-metrics README](https://github.com/buildkite/buildkite-agent-metrics?tab=readme-ov-file#buildkite-agent-metrics).
+
 ## Tracing
 
 For Datadog APM or OpenTelemetry tracing, see [Tracing in the Buildkite agent](/docs/agent/self-hosted/monitoring-and-observability/tracing).
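
The new section says the tool is useful for autoscaling based on queue depth and agent availability. As a rough illustration of that idea (not part of this change, and not how any Buildkite scaler is known to work), the sketch below turns two job counts into a target agent count. The function name `desired_agents`, its arguments, and the one-agent-per-job rule are all assumptions made for the example; in practice the inputs would come from whichever backend the published metrics land in.

```shell
#!/bin/sh
# Hypothetical helper: estimate how many agents a queue needs.
# Arguments (all illustrative): scheduled (waiting) jobs, running jobs,
# and the maximum number of agents the fleet may grow to.
desired_agents() {
  scheduled=$1
  running=$2
  max_agents=$3

  # Assumption: one agent per job that is either waiting or already running.
  wanted=$((scheduled + running))

  # Never request more agents than the configured ceiling.
  if [ "$wanted" -gt "$max_agents" ]; then
    wanted=$max_agents
  fi

  echo "$wanted"
}
```

For example, `desired_agents 7 3 8` prints `8`, because the ten jobs in flight exceed the ceiling of eight agents, while `desired_agents 2 1 8` prints `3`.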
pages/apis/agent_api.md (+1 -1: 1 addition & 1 deletion)
@@ -4,7 +4,7 @@ The agent REST API is used to retrieve agent metrics, register agents, de-regist
 
 The agent REST API's _publicly_ available endpoints include:
 
-- [`/metrics`](/docs/apis/agent-api/metrics): Used to retrieve information about current self-hosted agents associated with a Buildkite cluster. The [Buildkite Agent Metrics](https://github.com/buildkite/buildkite-agent-metrics) CLI tool uses the data returned by the metrics endpoint for agent autoscaling.
+- [`/metrics`](/docs/apis/agent-api/metrics): Used to retrieve information about current self-hosted agents associated with a Buildkite cluster. The [buildkite-agent-metrics](/docs/agent/self-hosted/monitoring-and-observability#buildkite-agent-metrics-cli) CLI tool uses the data returned by the metrics endpoint for agent autoscaling.
 - [`/stacks`](/docs/apis/agent-api/stacks): Used to implement a _stack_ on a self-hosted queue. A stack is a long-running controller process that watches the queue for jobs, and runs Buildkite agents on demand to run these jobs.
 
 All other endpoints in the agent API are intended only for use by the Buildkite agent, therefore stability and backwards compatibility are not guaranteed, and changes won't be announced.
pages/pipelines/best_practices/agent_management.md (+1 -1: 1 addition & 1 deletion)
@@ -96,7 +96,7 @@ Learn more about using clusters and queues in [Managing clusters](/docs/pipeline
 
 ## Right-sizing of your agent fleet
 
-- Monitor queue times with [cluster insights](/docs/pipelines/security/clusters#cluster-insights) and [Buildkite agent Metrics](https://github.com/buildkite/buildkite-agent-metrics).
+- Monitor queue times with [cluster insights](/docs/pipelines/security/clusters#cluster-insights) and the [buildkite-agent-metrics](/docs/agent/self-hosted/monitoring-and-observability#buildkite-agent-metrics-cli) tool.
 - Use cloud-based autoscaling ([Elastic CI Stack for AWS](https://github.com/buildkite/elastic-ci-stack-for-aws), [Buildkite agent Scaler](https://github.com/buildkite/buildkite-agent-scaler), [Agent Stack for Kubernetes](/docs/agent/self-hosted/agent-stack-k8s)).
 - Maintain dedicated pools for CPU-intensive, GPU-enabled, or OS-specific workloads.
 - Configure [graceful termination](/docs/agent/lifecycle#signal-handling) to allow jobs to complete.
pages/pipelines/best_practices/parallel_builds.md (+1 -1: 1 addition & 1 deletion)
@@ -147,7 +147,7 @@ In addition to the [Elastic CI Stack for AWS](/docs/agent/self-hosted/aws/elasti
 - [Pipelines REST API](/docs/apis/rest-api/pipelines) and [Agents API](/docs/apis/rest-api/agents) you're able to fetch each pipeline's job count, and information about each agent.
 - [Agent priorities](/docs/agent/self-hosted/prioritization) allow you to define which agents are assigned work first, such as high performance ephemeral agents.
 - [Agent queues](/docs/agent/queues) allow you to divide your agent pools into separate groups for scaling and performance purposes.
-- [buildkite-agent-metrics](https://github.com/buildkite/buildkite-agent-metrics) tool allow you to collect your organization's Buildkite metrics and report them to AWS CloudWatchand StatsD.
+- [buildkite-agent-metrics](/docs/agent/self-hosted/monitoring-and-observability#buildkite-agent-metrics-cli) tool allows you to collect your organization's Buildkite metrics and report them to a range of backends including AWS CloudWatch, StatsD, Prometheus, and OpenTelemetry.
 
 Using these tools you can automate your build infrastructure, scale your agents based on demand, and massively reduce build times using job parallelism.