
Commit 0caed44

clayton-cornell, Copilot, and ptodev authored
Refactor the Alloy Get Started topics (#4591)
* Restructure TOC, merge content
* Initial rewrite of yntax content
* Second pass at cleaning up syntax explanation
* Large refactor for consistent info style and flow
* Clean up H1 and title metadata
* Simplify the get started heading and title
* Better title for syntax
* Fix typo in metadata
* Restore old aliases
* Tweaks to the aliases
* Add clarification from PR #4692
* Update text to force commit
* fix a broken link
* Fix some of the passive voice issues
* More tweaks for passive vs active voice
* One more passive voice fix
* Update docs/sources/get-started/components/community-components.md Co-authored-by: Copilot <[email protected]>
* Update docs/sources/get-started/modules.md Co-authored-by: Copilot <[email protected]>
* Update docs/sources/get-started/clustering.md Co-authored-by: Copilot <[email protected]>
* Update docs/sources/get-started/components/community-components.md Co-authored-by: Copilot <[email protected]>
* Update docs/sources/get-started/components/custom-components.md Co-authored-by: Copilot <[email protected]>
* Update docs/sources/get-started/expressions/operators.md Co-authored-by: Copilot <[email protected]>
* Update docs/sources/get-started/_index.md Co-authored-by: Copilot <[email protected]>
* Update docs/sources/get-started/components/community-components.md Co-authored-by: Copilot <[email protected]>
* Remove redundant information, and fix duplicate heading
* Apply suggestions from code review Co-authored-by: Copilot <[email protected]>
* Update docs/sources/get-started/components/configure-components.md Co-authored-by: Copilot <[email protected]>
* Refactor to make more engaging and reduce duplication
* More restructing to improve info flow
* Fix the topic weight
* Update topic order and improve info flow
* Clean up informaiton flow
* More style and content updates
* More updates for info flow and consistency
* Reshuffling info to improve info flow
* Fix some Vale linting errors
* More concept flow cleanup
* Clean up a few broken links, fix some examples, and add missing text
* Add more context and info to teh landing page
* Add loki.source.file syntax tests
* Apply suggestion from @ptodev Co-authored-by: Paulin Todev <[email protected]>

---------

Co-authored-by: Copilot <[email protected]>
Co-authored-by: Paulin Todev <[email protected]>
1 parent fc8ed8f commit 0caed44

28 files changed, +2143 -975 lines changed

docs/sources/get-started/_index.md

Lines changed: 138 additions & 5 deletions
@@ -1,14 +1,147 @@
---
canonical: https://grafana.com/docs/alloy/latest/get-started/
aliases:
+- ./configuration-syntax/ # /docs/alloy/latest/get-started/configuration-syntax/
+- ./configuration-syntax/files/ # /docs/alloy/latest/get-started/configuration-syntax/files/
- ./concepts/ # /docs/alloy/latest/concepts/
+- ./concepts/configuration-syntax/ # /docs/alloy/latest/concepts/configuration-syntax/
+- ./concepts/configuration-syntax/files/ # /docs/alloy/latest/concepts/configuration-syntax/files/
description: Get started with Grafana Alloy
-title: Get started
-weight: 40
+title: Get started with Grafana Alloy
+menuTitle: Get started
+weight: 20
---

-# Get started
+# Get started with {{% param "FULL_PRODUCT_NAME" %}}

-This section helps you get started with {{< param "FULL_PRODUCT_NAME" >}}.
+{{< param "FULL_PRODUCT_NAME" >}} uses a configuration language to define how components collect, transform, and send data.
+Components are building blocks that perform specific tasks, such as reading files, collecting metrics, or sending data to external systems.

-{{< section >}}
+To write effective configurations, you need to understand three fundamental elements: blocks, attributes, and expressions.
+Mastering these building blocks lets you create powerful data collection and processing pipelines.
+
+## Basic configuration elements
+
+All {{< param "PRODUCT_NAME" >}} configurations use three main elements: blocks, attributes, and expressions.
+
+## Blocks
+
+Blocks group related settings and configure different parts of {{< param "PRODUCT_NAME" >}}.
+Each block has a name and contains attributes or nested blocks.
+
+```alloy
+prometheus.remote_write "production" {
+  endpoint {
+    url = "http://localhost:9009/api/prom/push"
+  }
+}
+```
+
+This example contains two blocks:
+
+- `prometheus.remote_write "production"`: Creates a component with the label `"production"`
+- `endpoint`: A nested block that configures connection settings
+
+## Attributes
+
+Attributes set individual values within blocks.
+They follow the format `ATTRIBUTE_NAME = ATTRIBUTE_VALUE`.
+
+```alloy
+log_level = "debug"
+timeout = 30
+enabled = true
+```
+
+## Expressions
+
+Expressions compute values for attributes.
+You can use simple constants or more complex calculations.
+
+**Constants:**
+
+```alloy
+name = "my-service"
+port = 9090
+tags = ["web", "api"]
+```
+
+**Simple calculations:**
+
+You can use arithmetic operations to compute values from other variables.
+This lets you build dynamic configurations where values depend on other settings.
+
+```alloy
+total_timeout = base_timeout + retry_timeout
+```
+
+**Function calls:**
+
+Function calls let you access system information and transform data.
+[Built-in][] functions like `sys.env()` retrieve environment variables, while others can manipulate strings, decode JSON, and perform other operations.
+
+```alloy
+home_dir = sys.env("HOME")
+config_path = home_dir + "/config.yaml"
+```
+
+**Component references:**
+
+Component references let you use data from other parts of your configuration.
+To reference a component's data, combine three parts with periods:
+
+- Component name: `local.file`
+- Label: `secret`
+- Export name: `content`
+- Result: `local.file.secret.content`
+
+```alloy
+password = local.file.secret.content
+```
+
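A minimal sketch of how those three parts fit together, assuming a `local.file` component labeled `secret` and a `prometheus.remote_write` component that consumes its `content` export. The file path, username, and endpoint URL are placeholder values:

```alloy
// Reads the file at `filename` and exposes it as `local.file.secret.content`.
local.file "secret" {
  filename  = "/run/secrets/remote_write_password" // placeholder path
  is_secret = true
}

// Consumes the export through the full `component.label.export` reference.
prometheus.remote_write "default" {
  endpoint {
    url = "http://localhost:9009/api/prom/push"

    basic_auth {
      username = "admin" // placeholder username
      password = local.file.secret.content
    }
  }
}
```

Setting `is_secret = true` asks Alloy to treat the file contents as a secret rather than plain text.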
+You'll learn about more powerful expressions in the dedicated [Expressions][] section, including how to reference data from other components and use more built-in functions.
+You can find the available exports for each component in the [Components][components] documentation.
+
+## Configuration syntax
+
+{{< param "PRODUCT_NAME" >}} uses a declarative configuration language, which means you describe what you want your system to do rather than how to do it.
+This design makes configurations flexible and easy to understand.
+
+You can organize blocks and attributes in any order that makes sense for your use case.
+{{< param "PRODUCT_NAME" >}} automatically determines the dependencies between components and evaluates them in the correct order.
+
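A rough sketch of that order independence, assuming placeholder target and endpoint values: the first component below references an export of the second, even though the second block appears later in the file.

```alloy
// References `prometheus.remote_write.default.receiver`, which is defined below.
prometheus.scrape "app" {
  targets    = [{"__address__" = "localhost:8080"}] // placeholder target
  forward_to = [prometheus.remote_write.default.receiver]
}

// Defined after it's referenced; Alloy works out the dependency order automatically.
prometheus.remote_write "default" {
  endpoint {
    url = "http://localhost:9009/api/prom/push"
  }
}
```

Swapping the two blocks produces the same pipeline, because Alloy evaluates components based on their references rather than their position in the file.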
+## Configuration files
+
+{{< param "PRODUCT_NAME" >}} configuration files conventionally use a `.alloy` file extension, though you can name single files anything you want.
+If you specify a directory path, {{< param "PRODUCT_NAME" >}} processes only files with the `.alloy` extension.
+You must save your configuration files as UTF-8 encoded text - {{< param "PRODUCT_NAME" >}} can't parse files with invalid UTF-8 encoding.
+
+## Tooling
+
+You can use these tools to write {{< param "PRODUCT_NAME" >}} configuration files:
+
+- Editor support:
+  - [VS Code](https://github.com/grafana/vscode-alloy)
+  - [Vim/Neovim](https://github.com/grafana/vim-alloy)
+- Code formatting: [`alloy fmt` command][fmt]
+
+## Next steps
+
+Now that you understand the basic syntax, learn how to use these elements to build working configurations:
+
+- [Components][] - Learn about the building blocks that collect, transform, and send data
+- [Expressions][] - Create dynamic configurations using functions and component references
+- [Alloy syntax][] - Explore advanced syntax features and patterns
+
+For hands-on learning:
+
+- [Tutorials][] - Build complete data collection pipelines step by step
+- [Components][components] - Browse all available components and their options
+
+[fmt]: ../../reference/cli/fmt/
+[Built-in]: ../reference/stdlib/
+[Alloy syntax]: ./syntax/
+[Components]: ./components/
+[Expressions]: ./expressions/
+[tutorials]: ../tutorials/
+[components]: ../reference/components/

docs/sources/get-started/clustering.md

Lines changed: 70 additions & 19 deletions
@@ -3,51 +3,71 @@ canonical: https://grafana.com/docs/alloy/latest/get-started/clustering/
aliases:
- ../concepts/clustering/ # /docs/alloy/latest/concepts/clustering/
description: Learn about Grafana Alloy clustering concepts
-menuTitle: Clustering
title: Clustering
-weight: 500
+weight: 70
---

# Clustering

-Clustering allows a fleet of {{< param "PRODUCT_NAME" >}} deployments to work together for workload distribution and high availability.
+You learned about components, expressions, syntax, and modules in the previous sections.
+Now you'll learn about clustering, which allows multiple {{< param "PRODUCT_NAME" >}} deployments to work together for distributed data collection.
+
+Clustering provides workload distribution and high availability.
It enables horizontally scalable deployments with minimal resource and operational overhead.

-{{< param "PRODUCT_NAME" >}} uses an eventually consistent model to achieve clustering.
-This model assumes all participating {{< param "PRODUCT_NAME" >}} deployments are interchangeable and converge on the same configuration file.
+{{< param "PRODUCT_NAME" >}} uses an eventually consistent model with a gossip protocol to achieve clustering.
+This model assumes all participating {{< param "PRODUCT_NAME" >}} deployments are interchangeable and use identical configurations.
+The cluster uses a consistent hashing algorithm to distribute work among nodes.

A standalone, non-clustered {{< param "PRODUCT_NAME" >}} behaves the same as a single-node cluster.

-You configure clustering by passing `cluster` command-line flags to the [run][] command.
+You configure clustering by passing `--cluster.*` command-line flags to the [`alloy run`][run] command.
+Cluster-enabled components must explicitly enable clustering through a `clustering` block in their configuration.

## Use cases

+Clustering serves several purposes in {{< param "PRODUCT_NAME" >}} deployments, with the primary focus being workload distribution and scalability.
+
### Target auto-distribution

-Target auto-distribution is the simplest use case of clustering.
+Target auto-distribution is the most common use case of clustering.
It lets scraping components running on all peers distribute the scrape load among themselves.
-Target auto-distribution requires all {{< param "PRODUCT_NAME" >}} deployments in the same cluster to access the same service discovery APIs and scrape the same targets.
+
+For target auto-distribution to work:
+
+1. All {{< param "PRODUCT_NAME" >}} deployments in the same cluster must access the same service discovery APIs.
+1. All deployments must scrape the same targets.

You must explicitly enable target auto-distribution on components by defining a `clustering` block.
+This integrates with the component system you learned about in previous sections:

```alloy
prometheus.scrape "default" {
+  targets = discovery.kubernetes.pods.targets
+
  clustering {
    enabled = true
  }

-  ...
+  forward_to = [prometheus.remote_write.default.receiver]
+}
+
+prometheus.remote_write "default" {
+  endpoint {
+    url = "https://prometheus.example.com/api/v1/write"
+  }
}
```

-A cluster detects state changes when a node joins or leaves.
-All participating components locally recalculate target ownership and re-balance the number of targets they're scraping without explicitly communicating ownership over the network.
+When a cluster detects state changes (when a node joins or leaves), all participating components locally recalculate target ownership using a consistent hashing algorithm.
+Components re-balance the targets they're scraping without explicitly communicating ownership over the network.
+Each node uses 512 tokens in the hash ring for optimal load distribution.

Target auto-distribution lets you dynamically scale the number of {{< param "PRODUCT_NAME" >}} deployments to handle workload peaks.
-It also provides resiliency because one of the node peers automatically picks up targets if a node leaves.
+It also provides resiliency because remaining node peers automatically pick up targets if a node leaves the cluster.

{{< param "PRODUCT_NAME" >}} uses a local consistent hashing algorithm to distribute targets.
-On average, only ~1/N of the targets are redistributed.
+When the cluster size changes, this algorithm redistributes only approximately 1/N of the targets, minimizing disruption.

Refer to the component reference documentation to check if a component supports clustering, such as:

@@ -58,31 +78,62 @@ Refer to the component reference documentation to check if a component supports

## Best practices

+Follow these guidelines to ensure effective clustering in your {{< param "PRODUCT_NAME" >}} deployments.
+
### Avoid issues with disproportionately large targets

-When your environment has a mix of very large and average-sized targets, avoid running too many cluster instances. While clustering generally does a good job of sharding targets to achieve balanced workload distribution, significant target size disparity can lead to uneven load distribution. When you have a few disproportionately large targets among many instances, the nodes assigned these large targets will experience much higher load compared to others (e.g. samples/second in case of Prometheus metrics), potentially causing uneven load balancing or hitting resource limitations. In these scenarios, it's often better to scale vertically rather than horizontally to reduce the impact of outlier large targets. This approach ensures more consistent resource utilization across your deployment and prevents overloading specific instances.
+When your environment has a mix of very large and average-sized targets, avoid running too many cluster instances.
+While clustering generally does a good job of sharding targets to achieve balanced workload distribution, significant target size disparity can lead to uneven load distribution.
+When you have a few disproportionately large targets among many instances, the nodes assigned these large targets experience much higher load compared to others, for example samples per second for Prometheus metrics, potentially causing uneven load balancing or hitting resource limitations.
+In these scenarios, it's often better to scale vertically rather than horizontally to reduce the impact of outlier large targets.
+This approach ensures more consistent resource utilization across your deployment and prevents overloading specific instances.

### Use `--cluster.wait-for-size`, but with caution

-When using clustering in a deployment where a single instance cannot handle the entire load, it's recommended to use the `--cluster.wait-for-size` flag to ensure a minimum cluster size before accepting traffic. However, leave a significant safety margin when configuring this value by setting it significantly smaller than your typical expected operational number of instances. When this condition is not met, the instances will completely stop processing traffic in cluster-enabled components so it's important to leave room for any unexpected events.
+When you use clustering in a deployment where a single instance can't handle the entire load, use the `--cluster.wait-for-size` flag to ensure a minimum cluster size before accepting traffic.
+However, leave a significant safety margin when you configure this value by setting it significantly smaller than your typical expected operational number of instances.
+When this condition isn't met, the instances stop processing traffic in cluster-enabled components, so it's important to leave room for any unexpected events.

-For example, if you're using Horizontal Pod Autoscalers (HPA) or PodDisruptionBudgets (PDB) in Kubernetes, ensure that the `--cluster.wait-for-size` flag is set to a value well below what your HPA and PDB minimums allow. This prevents traffic from stopping when Kubernetes instance counts temporarily drop below these thresholds during normal operations like pod termination or rolling updates.
+For example, if you're using Horizontal Pod Autoscalers (HPA) or PodDisruptionBudgets (PDB) in Kubernetes, set the `--cluster.wait-for-size` flag to a value well below what your HPA and PDB minimums allow.
+This prevents traffic from stopping when Kubernetes instance counts temporarily drop below these thresholds during normal operations like Pod termination or rolling updates.

-We recommend to use the `--cluster.wait-timeout` flag to set a reasonable timeout for the waiting period to limit the impact of potential misconfiguration. The appropriate timeout duration should be based on how quickly you expect your orchestration or incident response team to provision required number of instances. Be aware that when timeout passes the cluster may be too small to handle traffic and run into further issues.
+It's recommended to use the `--cluster.wait-timeout` flag to set a reasonable timeout for the waiting period to limit the impact of potential misconfiguration.
+You can base the timeout duration on how quickly you expect your orchestration or incident response team to provision the required number of instances.
+Be aware that when the timeout passes, the cluster may be too small to handle traffic and can run into further issues.

-### Do not enable clustering when you don't need it
+### Don't enable clustering if you don't need it

-While clustering scales to very large numbers of instances, it introduces additional overhead in the form of logs, metrics, potential alerts, and processing requirements. If you're not using components that specifically support and benefit from clustering, it's best to not enable clustering at all. A particularly common mistake is enabling clustering on logs collecting DaemonSets. Collecting logs from mounted node's pod logs does not benefit from having clustering enabled since each instance typically collects logs only from its own node. In such cases, enabling clustering only adds unnecessary complexity and resource usage without providing functional benefits.
+While clustering scales to very large numbers of instances, it introduces additional overhead in the form of logs, metrics, potential alerts, and processing requirements.
+If you're not using components that specifically support and benefit from clustering, it's best to not enable clustering at all.
+A particularly common mistake is enabling clustering on logs collecting DaemonSets.
+Collecting logs from Pods on the mounted node doesn't benefit from having clustering enabled since each instance typically collects logs only from Pods on its own node.
+In such cases, enabling clustering only adds unnecessary complexity and resource usage without providing functional benefits.

## Cluster monitoring and troubleshooting

You can monitor your cluster status using the {{< param "PRODUCT_NAME" >}} UI [clustering page][].
Refer to [Debug clustering issues][debugging] for additional troubleshooting information.

+## Next steps
+
+Now that you understand how clustering works with {{< param "PRODUCT_NAME" >}} components, explore these topics:
+
+- [Deploy {{< param "PRODUCT_NAME" >}}][deploy] - Set up clustered deployments in production environments.
+- [Monitor {{< param "PRODUCT_NAME" >}}][monitor] - Learn about monitoring cluster health and performance.
+- [Troubleshooting][debugging] - Debug clustering issues and interpret cluster metrics.
+
+For detailed configuration:
+
+- [`alloy run` command reference][run] - Configure clustering using command-line flags.
+- [Component reference][components] - Explore clustering-enabled components like `prometheus.scrape` and `pyroscope.scrape`.
+
[run]: ../../reference/cli/run/#clustering
[prometheus.scrape]: ../../reference/components/prometheus/prometheus.scrape/#clustering
[pyroscope.scrape]: ../../reference/components/pyroscope/pyroscope.scrape/#clustering
[prometheus.operator.podmonitors]: ../../reference/components/prometheus/prometheus.operator.podmonitors/#clustering
[prometheus.operator.servicemonitors]: ../../reference/components/prometheus/prometheus.operator.servicemonitors/#clustering
[clustering page]: ../../troubleshoot/debug/#clustering-page
[debugging]: ../../troubleshoot/debug/#debug-clustering-issues
+[components]: ../../reference/components/
+[deploy]: ../../set-up/deploy/
+[monitor]: ../../monitor/

docs/sources/get-started/community_components.md

Lines changed: 0 additions & 23 deletions
This file was deleted.
