**articles/operator-insights/ingestion-agent-configuration-reference.md** (+12 −10)
````diff
@@ -8,6 +8,7 @@ ms.service: operator-insights
 ms.topic: conceptual
 ms.date: 12/06/2023
 ---
+
 # Configuration reference for Azure Operator Insights ingestion agent
 
 This reference provides the complete set of configuration for the [Azure Operator Insights ingestion agent](ingestion-agent-overview.md), listing all fields with explanatory comments.
@@ -22,12 +23,12 @@ This reference shows two pipelines: one with an MCC EDR source and one with an SFTP pull source.
 
 ```yaml
 # A unique identifier for this agent instance. Reserved URL characters must be percent-encoded. It's included in the upload path to the Data Product's input storage account.
-agent_id: agent01
+agent_id: agent01
 # Config for secrets providers. We support reading secrets from Azure Key Vault and from the VM's local filesystem.
 # Multiple secret providers can be defined and each must be given a unique name, which is referenced later in the config.
 # A secret provider of type `key_vault`, which contains details required to connect to the Azure Key Vault and allow connection to the Data Product's input storage account. This is always required.
 # A secret provider of type `file_system`, which specifies a directory on the VM where secrets are stored: for example, for an SFTP pull source, to store credentials for connecting to an SFTP server.
-secret_providers:
+secret_providers:
 - name: data_product_keyvault_mi
   key_vault:
     vault_name: contoso-dp-kv
````
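The `file_system` provider type described in the comments doesn't appear in this hunk. For orientation, here's a minimal sketch of what such a provider can look like; the exact field layout is an assumption, though `secrets_directory` matches the field name used later in these docs:

```yaml
# Hypothetical file_system provider; the field layout is assumed, not quoted from the docs.
secret_providers:
- name: local_file_system
  file_system:
    # A directory on the VM where secret files (for example, SFTP credentials) are stored.
    secrets_directory: /path/to/secrets/directory
```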
````diff
@@ -73,7 +74,7 @@ sink:
   # Optional. A string giving a base path to use in the container in the Data Product's input storage account. Reserved URL characters must be percent-encoded. See the Data Product for what value, if any, is required.
   base_path: base-path
   sas_token:
-    # This must reference a secret provider configured above.
+    # This must reference a secret provider configured above.
     secret_provider: data_product_keyvault_mi
     # The name of a secret in the corresponding provider.
     # This will be the name of a secret in the Key Vault.
@@ -102,13 +103,13 @@ source:
   mcc_edrs:
     # The maximum amount of data to buffer in memory before uploading. Units are B, KiB, MiB, GiB, etc.
     message_queue_capacity: 32 MiB
-    # A quick check on the maximum RAM that the agent should use.
-    # This is a guide to check the other tuning parameters, rather than a hard limit.
+    # A quick check on the maximum RAM that the agent should use.
+    # This is a guide to check the other tuning parameters, rather than a hard limit.
     maximum_overall_capacity: 1216 MiB
     listener:
       # The TCP port to listen on. Must match the port MCC is configured to send to. Defaults to 36001.
       port: 36001
-      # EDRs greater than this size are dropped. Subsequent EDRs continue to be processed.
+      # EDRs greater than this size are dropped. Subsequent EDRs continue to be processed.
       # This condition likely indicates MCC sending larger than expected EDRs. MCC is not normally expected
       # to send EDRs larger than the default size. If EDRs are being dropped because of this limit,
       # investigate and confirm that the EDRs are valid, and then increase this value. Units are B, KiB, MiB, GiB, etc.
@@ -118,7 +119,7 @@ source:
       # corrupt EDRs to Azure. You should not need to change this value. Units are B, KiB, MiB, GiB, etc.
       hard_maximum_message_size: 100000 B
     batching:
-      # The maximum size of a single blob (file) to store in the Data Product's input storage account.
+      # The maximum size of a single blob (file) to store in the Data Product's input storage account. Units are B, KiB, MiB, GiB, etc.
       maximum_blob_size: 128 MiB
       # The maximum time to wait when no data is received before uploading pending batched data to the Data Product's input storage account. Examples: 30s, 10m, 1h, 1d.
       blob_rollover_period: 5m
@@ -149,16 +150,17 @@ source:
       # Only for use with password authentication. The name of the file containing the password in the secrets_directory folder.
       secret_name: sftp-user-password
       # Only for use with private key authentication. The name of the file containing the SSH key in the secrets_directory folder.
-      key_secret: sftp-user-ssh-key
+      key_secret_name: sftp-user-ssh-key
       # Optional. Only for use with private key authentication. The passphrase for the SSH key. This can be omitted if the key is not protected by a passphrase.
       passphrase_secret_name: sftp-user-ssh-key-passphrase
 
     # The path to a folder on the SFTP server that files will be uploaded to Azure Operator Insights from.
     base_path: /path/to/sftp/folder
     # Optional. A regular expression to specify which files in the base_path folder should be ingested. If not specified, the agent will attempt to ingest all files in the base_path folder (subject to exclude_pattern, settling_time and exclude_before_time).
-    include_pattern: "*\.csv$"
+    include_pattern: ".*\.csv$" # Only include files which end in ".csv"
     # Optional. A regular expression to specify any files in the base_path folder which should not be ingested. Takes priority over include_pattern, so files which match both regular expressions will not be ingested.
-    exclude_pattern: '\.backup$'
+    # The exclude_pattern can also be used to ignore whole directories, but the pattern must still match all files under that directory: for example, `^excluded-dir/.*$` or `^excluded-dir/`, but *not* `^excluded-dir$`.
+    exclude_pattern: "^\.staging/|\.backup$" # Exclude all file paths that start with ".staging/" or end in ".backup"
     # A duration, such as "10s", "5m", "1h". During an upload run, any files last modified within the settling time are not selected for upload, as they may still be being modified.
     settling_time: 1m
     # Optional. A datetime that adheres to the RFC 3339 format. Any files last modified before this datetime will be ignored.
````
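The corrected `include_pattern` and the broadened `exclude_pattern` above are easy to misread, so here's a small illustration of the documented selection rules, where exclude takes priority over include. It uses `grep -E` as a stand-in matcher; the agent's own regex engine may differ in details, so treat this as a sketch:

```bash
include='.*\.csv$'
exclude='^\.staging/|\.backup$'
for f in 'reports/day1.csv' '.staging/day1.csv' 'reports/day1.csv.backup' 'notes.txt'; do
  # exclude_pattern takes priority over include_pattern
  if ! grep -qE "$exclude" <<<"$f" && grep -qE "$include" <<<"$f"; then
    echo "ingest: $f"
  else
    echo "skip:   $f"
  fi
done
# Only reports/day1.csv is ingested; the other three paths are skipped.
```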
**articles/operator-insights/monitor-troubleshoot-ingestion-agent.md** (+3 −0)
````diff
@@ -32,6 +32,8 @@ Metrics are reported in a simple human-friendly form.
 
 To collect a diagnostics package, SSH to the Virtual Machine and run the command `/usr/bin/microsoft/az-aoi-ingestion-gather-diags`. This command generates a date-stamped zip file in the current directory that you can copy from the system.
 
+If you have configured collection of logs through the Azure Monitor Agent, you can view ingestion agent logs in the portal view of your Log Analytics workspace, and may not need to collect a diagnostics package to debug your issues.
+
 > [!NOTE]
 > Microsoft Support might request diagnostics packages when investigating an issue. Diagnostics packages don't contain any customer data or the value of any credentials.
 
````
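As a worked example of the collection flow described above; the VM address, user, and zip filename here are placeholders, not values the tool guarantees:

```bash
# From your workstation, connect to the ingestion agent VM and gather diagnostics.
ssh azureuser@ingestion-vm.example.com
/usr/bin/microsoft/az-aoi-ingestion-gather-diags
# Note the date-stamped zip written to the current directory, then disconnect.
exit
# Copy the zip off the VM (substitute the actual filename the tool reported).
scp azureuser@ingestion-vm.example.com:~/<diagnostics-package>.zip .
```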
````diff
@@ -117,6 +119,7 @@ Symptoms: No data appears in Azure Data Explorer. Logs of category `Ingestion` d
 
 - Check that the agent is running on all VMs and isn't reporting errors in logs.
 - Check that files exist in the correct location on the SFTP server, and that they aren't being excluded due to file source config (see [Files are missing](#files-are-missing)).
+- Ensure that the configured SFTP user can read all directories under the `base_path` that the file source config doesn't exclude.
 - Check the network connectivity and firewall configuration between the ingestion agent VM and the Data Product's input storage account.
````
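For the first check in this list, something like the following on each VM confirms the agent is up and surfaces recent errors. The service name is an assumption based on the `az-aoi-ingestion` naming used elsewhere in these docs:

```bash
sudo systemctl status az-aoi-ingestion                 # assumed service name
sudo tail -n 50 /var/log/az-aoi-ingestion/stdout.log   # recent agent log lines
```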
**articles/operator-insights/set-up-ingestion-agent.md** (+13 −4)
````diff
@@ -139,6 +139,9 @@ On the SFTP server:
 
 1. Ensure port 22/TCP to the VM is open.
 1. Create a new user, or determine an existing user on the SFTP server that the ingestion agent should use to connect to the SFTP server.
+   - By default, the ingestion agent searches every directory under the base path, so this user must be able to read all of them. Any directories that the user doesn't have permission to access must be excluded using the `exclude_pattern` configuration.
+   > [!Note]
+   > Implicitly excluding directories by not specifying them in the include pattern isn't sufficient to stop the agent searching those directories. See [the configuration reference](ingestion-agent-configuration-reference.md) for more detail on excluding directories.
 1. Determine the authentication method that the ingestion agent should use to connect to the SFTP server. The agent supports:
    - Password authentication
    - SSH key authentication
````
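If you choose SSH key authentication, key setup typically looks like the following sketch; the key type, filenames, user, and host are placeholders, and your server's policy may require different choices:

```bash
# Generate a key pair for the agent. Supplying a passphrase is optional;
# if you set one, it must also be stored as a secret file for the agent (see below).
ssh-keygen -t ed25519 -f ./ingestion-agent-key
# Authorize the public key for the SFTP user on the SFTP server.
ssh-copy-id -i ./ingestion-agent-key.pub sftp-user@sftp-server.example.com
```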
````diff
@@ -277,7 +280,12 @@ The configuration you need is specific to the type of source and your Data Product.
 - `user`: the name of the user on the SFTP server that the agent should use to connect.
 - Depending on the method of authentication you chose in [Prepare the VMs](#prepare-the-vms), set either `password` or `private_key`.
   - For password authentication, set `secret_name` to the name of the file containing the password in the `secrets_directory` folder.
-  - For SSH key authentication, set `key_secret` to the name of the file containing the SSH key in the `secrets_directory` folder. If the private key is protected with a passphrase, set `passphrase_secret_name` to the name of the file containing the passphrase in the `secrets_directory` folder.
+  - For SSH key authentication, set `key_secret_name` to the name of the file containing the SSH key in the `secrets_directory` folder. If the private key is protected with a passphrase, set `passphrase_secret_name` to the name of the file containing the passphrase in the `secrets_directory` folder.
+  - All secret files should have permissions of `600` (`rw-------`) and an owner of `az-aoi-ingestion`, so that only the ingestion agent and privileged users can read them.
+    ```
+    sudo chmod 600 <secrets_directory>/*
+    sudo chown az-aoi-ingestion <secrets_directory>/*
+    ```
 
 For required or recommended values for other fields, refer to the documentation for your Data Product.
````
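After running the `chmod` and `chown` commands above, a quick listing verifies the intended state; the filenames shown follow the examples in the configuration reference, and the group, size, and date columns will vary:

```bash
ls -l <secrets_directory>
# Expected shape of each line:
# -rw------- 1 az-aoi-ingestion <group> <size> <date> sftp-user-password
# -rw------- 1 az-aoi-ingestion <group> <size> <date> sftp-user-ssh-key
```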
````diff
@@ -327,11 +335,12 @@ If you're running the ingestion agent on an Azure VM or on an on-premises VM con
 To collect ingestion agent logs, follow [the Azure Monitor documentation to install the Azure Monitor Agent and configure log collection](../azure-monitor/agents/data-collection-text-log.md).
 
 - These docs use the Az PowerShell module to create a logs table. Follow the [Az PowerShell module install documentation](/powershell/azure/install-azure-powershell) first.
-  - The `YourOptionalColumn` section from the sample `$tableParams` JSON is unnecessary for the ingestion agent, and can be removed.
+  - The `YourOptionalColumn` section from the sample `$tableParams` JSON is unnecessary for the ingestion agent, and can be removed.
 - When adding a data source to your data collection rule, add a `Custom Text Logs` source type, with file pattern `/var/log/az-aoi-ingestion/stdout.log`.
-- After adding the data collection rule, you can query these logs through the Log Analytics workspace. Use the following query to make them easier to work with:
+- We also recommend following [the documentation to add a `Linux Syslog` Data source](../azure-monitor/agents/data-collection-syslog.md) to your data collection rule, to allow for auditing of all processes running on the VM.
+- After adding the data collection rule, you can query the ingestion agent logs through the Log Analytics workspace. Use the following query to make them easier to work with:
   ```
-  RawAgentLogs_CL
+  <CustomTableName>
   | extend RawData = replace_regex(RawData, '\\x1b\\[\\d{1,4}m', '') // Remove any color tags
   | parse RawData with TimeGenerated: datetime ' ' Level ' ' Message // Parse the log lines into the TimeGenerated, Level and Message columns for easy filtering
   ```
````