docs/send-data/collect-from-other-data-sources/azure-blob-storage/append-blob/collect-logs.md
This section has instructions for configuring a pipeline for shipping logs available in Azure Blob Storage append blobs to Sumo Logic.
## Functional overview
1. You configure the Azure service to export logs to a container in a storage account created for that purpose.
1. The ARM template creates an Event Grid subscription with the storage container as publisher and the event hub (created by the Sumo Logic-provided ARM template) as subscriber. Event Grid routes append blob creation events to the event hub.
1. Event Hub streams the events to the AppendBlobFileTracker Azure function, which creates an entry in the FileOffSetMap table.
1. Periodically, an Azure function named AppendBlobTaskProducer fetches the list of blobs from the FileOffSetMap table and creates a task with metadata: a JSON object that includes the start byte of the append blob, the file path, the container name, and the storage account name (see the sketch after this list). These tasks are then pushed to the Azure Service Bus queue.
1. The TaskConsumer Azure function, which is triggered when the Service Bus receives a new task, reads the append blob from the start byte and sends that data to Sumo Logic.
1. For more information about the solution strategy, see [Azure Blob Storage (append blob)](/docs/send-data/collect-from-other-data-sources/azure-blob-storage/append-blob).
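
As a concrete illustration of the task described in step 4, here is a minimal Python sketch that builds such a payload and pushes it to a Service Bus queue. The field names (`startByte`, `filePath`, `containerName`, `storageName`), the queue name, and the connection string are hypothetical placeholders; the authoritative task format is defined in the AppendBlobTaskProducer function.

```python
import json

from azure.servicebus import ServiceBusClient, ServiceBusMessage

# Hypothetical task payload mirroring step 4; the real field names are
# defined in the AppendBlobTaskProducer function, not here.
task = {
    "startByte": 0,                      # offset from which the consumer resumes reading
    "filePath": "app/2024/01/app.log",   # blob path inside the container
    "containerName": "logs-container",
    "storageName": "mystorageaccount",
}

# Placeholder connection string and queue name.
SERVICE_BUS_CONN_STR = "<service-bus-connection-string>"
QUEUE_NAME = "<task-queue-name>"

with ServiceBusClient.from_connection_string(SERVICE_BUS_CONN_STR) as client:
    with client.get_queue_sender(queue_name=QUEUE_NAME) as sender:
        sender.send_messages(ServiceBusMessage(json.dumps(task)))
```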
## Step 1. Configure Azure storage account
In this step, you configure a storage account to which you will export monitoring data for your Azure service. The storage account must be a General-purpose v2 (GPv2) storage account.
If you have a storage account with a container you want to use for this purpose, make a note of its resource group, storage account name, and container name, and proceed to [Step 2](#step-2-configure-an-http-source).
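
If you prefer to script this rather than use the portal steps below, a minimal sketch with the Azure Python management SDK might look like the following. The subscription ID, resource group, account name, container name, and region are placeholders, and `kind="StorageV2"` is the ARM value corresponding to a General-purpose v2 account.

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.storage import StorageManagementClient

SUBSCRIPTION_ID = "<subscription-id>"   # placeholder
RESOURCE_GROUP = "<resource-group>"     # placeholder
ACCOUNT_NAME = "<storageaccountname>"   # placeholder, must be globally unique
CONTAINER_NAME = "<container-name>"     # placeholder

client = StorageManagementClient(DefaultAzureCredential(), SUBSCRIPTION_ID)

# Create a General-purpose v2 (GPv2) account, as required above.
poller = client.storage_accounts.begin_create(
    RESOURCE_GROUP,
    ACCOUNT_NAME,
    {
        "location": "eastus",
        "kind": "StorageV2",
        "sku": {"name": "Standard_LRS"},
    },
)
poller.result()  # wait for the account to be provisioned

# Create the container that the Azure service will export logs into.
client.blob_containers.create(RESOURCE_GROUP, ACCOUNT_NAME, CONTAINER_NAME, {})
```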
To configure an Azure storage account, do the following:
## Step 2. Configure an HTTP source

In this step, you configure an HTTP source to receive logs from the Azure function.
1. Select a hosted collector where you want to configure the HTTP source. If desired, create a new hosted collector, as described in [Configure a Hosted Collector and Source](/docs/send-data/hosted-collectors/configure-hosted-collector).
:::note
Make sure the hosted collector is tagged with the `tenant_name` field for the out-of-the-box Azure apps to work. You can get the tenant name using the instructions [here](https://learn.microsoft.com/en-us/azure/active-directory-b2c/tenant-management-read-tenant-name#get-your-tenant-name).
:::
1. Configure an HTTP source as described in [HTTP Logs and Metrics Source](/docs/send-data/hosted-collectors/http-source/logs-metrics). Make a note of the URL for the source because you will need it in the next step.
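
To see how this URL is used downstream: conceptually, the TaskConsumer function reads a byte range of an append blob and POSTs it to the HTTP source. Below is a minimal Python sketch of that flow; the connection string, container and blob names, and source URL are placeholders, and the production logic lives in the Sumo Logic-provided Azure functions.

```python
import requests
from azure.storage.blob import BlobClient

# Placeholders: use your own connection string, container/blob names, and
# the HTTP source URL you noted when creating the source.
STORAGE_CONN_STR = "<storage-connection-string>"
SUMO_HTTP_SOURCE_URL = "https://collectors.sumologic.com/receiver/v1/http/<unique-token>"

blob = BlobClient.from_connection_string(
    STORAGE_CONN_STR, container_name="logs-container", blob_name="app/app.log"
)

# In the real pipeline, this offset comes from the FileOffSetMap table.
start_byte = 0
data = blob.download_blob(offset=start_byte).readall()

# Sumo Logic HTTP sources accept raw log payloads via POST.
resp = requests.post(SUMO_HTTP_SOURCE_URL, data=data)
resp.raise_for_status()
```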
## Step 3. Configure Azure resources using ARM template
In this step, you use a Sumo Logic-provided Azure Resource Manager (ARM) template to create an Event Hub, three Azure functions, a Service Bus Queue, and a Storage Account.
1. Download the [appendblobreaderdeploy.json](https://raw.githubusercontent.com/SumoLogic/sumologic-azure-function/master/AppendBlobReader/src/appendblobreaderdeploy.json) ARM template.
1. Click **Create a Resource**, search for **Template deployment** in the Azure Portal, and then click **Create**.
### Step 2: Create an event grid subscription
This section provides instructions for creating an event grid subscription that delivers all blob creation events to the Event Hub created by the ARM template in [Step 3](#step-3-enabling-vnet-integration-optional) above.
To create an event grid subscription, do the following:
* **Resource**. Select the Storage Account you configured, from which you want to ingest logs.
* **System Topic Name**. Provide the topic name. If the system topic already exists, the existing topic is selected automatically.
:::note
If you do not see your configured Storage Account in the dropdown menu, make sure you meet the requirements in the [Requirements](#requirements) section.
:::
1. Specify the following details for Event Types:
1. (Optional) Specify the following Filters tab options:
* Check **Enable subject filtering**.
* To filter events by container name, enter the following in the **Subject Begins With** field, replacing `<container_name>` with the name of the container from which you want to export logs: `/blobServices/default/containers/<container_name>/`. The sketch below shows how this prefix matches event subjects.
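
Blob creation events carry a subject of the form `/blobServices/default/containers/<container>/blobs/<blob-path>`, and **Subject Begins With** performs a prefix match against it. A small Python illustration, with a hypothetical container name:

```python
container_name = "logs-container"  # hypothetical container
subject_filter = f"/blobServices/default/containers/{container_name}/"

# Subject emitted by Event Grid for a newly created append blob in that container.
event_subject = (
    f"/blobServices/default/containers/{container_name}/blobs/app/2024/01/app.log"
)

# Event Grid delivers the event only when the subject starts with the filter.
assert event_subject.startswith(subject_filter)
```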
### Step 3: Enabling VNet Integration (Optional)
This assumes that your storage account access is enabled for selected networks.
1. Create a subnet in a virtual network using the instructions in the [Azure documentation](https://docs.microsoft.com/en-us/azure/virtual-network/virtual-network-manage-subnet#add-a-subnet). If you have multiple accounts in the same region, you can skip step 2 below, reuse the same subnet, and add it to the storage account as described in step 3.
1. Perform the steps below for the BlobTaskConsumer function app:
1. Go to **Function App > Settings > Networking**.
1. Under **Outbound traffic**, click **VNet Integration**. <br/>
1. Add the VNet and subnet created in step 1. <br/>
1. Also copy the outbound IP addresses; you’ll need to add them to the firewall configuration of your storage account. <br/>
1. Go to the storage account from which you want to collect logs. Go to **Networking** and add the same VNet and subnet. <br/>
1. Add the outbound IP addresses (copied in step 2.iv) from both BlobTaskConsumer functions under **Firewall**, entering each IP in its own row of the **Address range** column.
1. Verify by going to the subnet. You should see Subnet delegation and service endpoints as shown in the screenshot below. <br/>
## Upgrading Azure Functions
1. Go to the resource group where the ARM template was deployed and open each of the function apps.
## Azure Append Blob Limitations
1. By default, the boundary regexes used for JSON and log files are defined below. You can override them by updating the `getBoundaryRegex` method of the `AzureBlobTaskConsumer` function; a toy splitting example appears after this list.
1. By default, it's assumed that a log file won't be updated after 48 hours. You can override this with the `MAX_LOG_FILE_ROLLOVER_HOURS` setting in the `AppendBlobTaskProducer` function.
1. By default, the batch size is calculated automatically based on the number of files present in the storage account; the maximum batch size is 200 MB.
1. The `AppendBlobTaskProducer` function sets a lock (for a maximum of 30 minutes) and creates the task (in the Service Bus) for the file, then automatically releases the lock if `AppendBlobTaskConsumer` fails to process it. If you are seeing a queueing delay of more than 30 minutes, you can increase `maxlockThresholdMin` in the `getLockedEntitiesExceedingThreshold` method of the `AppendBlobTaskProducer` function.
1. Log files have a file extension of `.json` (JSON Lines), `.blob` (JSON Lines), `.csv`, `.txt`, or `.log`.
    * If the file is `.json` or `.blob`, the JSON objects are extracted and sent to Sumo Logic.
    * If the file is `.log`, `.txt`, or `.csv`, log lines are sent to Sumo Logic as-is.
1. By default, all the data is ingested to a single HTTP source. If you want to send data to multiple sources (recommended if you have different log formats), you can override the `getSumoEndpoint` function in the `AppendBlobTaskConsumer` function.
1. The blob file name present in the `_sourceName` metadata field will be truncated if it exceeds 128 characters.
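
As an illustration of the boundary regex mentioned in the first limitation, the following toy Python example splits a chunk of a log file into messages at timestamp boundaries. The regex here is hypothetical; the actual defaults live in the `getBoundaryRegex` method of the `AzureBlobTaskConsumer` function.

```python
import re

# Hypothetical boundary regex: a new message starts at a line that begins
# with an ISO-style timestamp. The real defaults live in getBoundaryRegex.
BOUNDARY = re.compile(r"\n(?=\d{4}-\d{2}-\d{2} )")

chunk = (
    "2024-01-01 10:00:00 starting job\n"
    "    stack trace continuation line\n"
    "2024-01-01 10:00:05 job finished"
)

messages = BOUNDARY.split(chunk)
# -> ['2024-01-01 10:00:00 starting job\n    stack trace continuation line',
#     '2024-01-01 10:00:05 job finished']
print(messages)
```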