
Commit c55139d

Merge pull request #104714 from HeidiSteen/heidist-monitor
[Azure Cognitive Search] logging updates
2 parents d0ccc41 + e7302d7 commit c55139d

8 files changed: 108 additions & 66 deletions

articles/search/TOC.yml

Lines changed: 2 additions & 2 deletions
@@ -323,10 +323,10 @@
   items:
   - name: Fundamentals
     href: search-monitor-usage.md
-  - name: Monitor query activity
-    href: search-monitor-queries.md
   - name: Diagnostic logging
     href: search-monitor-logs.md
+  - name: Monitor query activity
+    href: search-monitor-queries.md
   - name: Search traffic analytics
     href: search-traffic-analytics.md
   - name: Troubleshoot
3 binary image files changed (35.5 KB, 42.2 KB, 21.2 KB)

articles/search/search-capacity-planning.md

Lines changed: 12 additions & 8 deletions
@@ -13,9 +13,11 @@ ms.date: 02/14/2020

 # Adjust capacity in Azure Cognitive Search

-Before [provisioning a search service](search-create-service-portal.md) and locking in a specific pricing tier, take a few minutes to understand the role of replicas and partitions in a service, whether you need proportionally larger or faster partitions, and how you might configure the service for expected load.
+Before [provisioning a search service](search-create-service-portal.md) and locking in a specific pricing tier, take a few minutes to understand the role of replicas and partitions in a service and how you might adjust a service to accommodate spikes and dips in resource demand.

-Capacity is a function of the [tier you choose](search-sku-tier.md) (tiers determine hardware characteristics), and the replica and partition combination necessary for projected workloads. This article focuses on replica and partition combinations and interactions.
+Capacity is a function of the [tier you choose](search-sku-tier.md) (tiers determine hardware characteristics), and the replica and partition combination necessary for projected workloads. Depending on the tier and the size of the adjustment, adding or reducing capacity can take anywhere from 15 minutes to several hours.
+
+When modifying the allocation of replicas and partitions, we recommend using the Azure portal. The portal enforces limits on allowable combinations that stay below maximum limits of a tier. However, if you require a script-based or code-based provisioning approach, the [Azure PowerShell](search-manage-powershell.md) or the [Management REST API](https://docs.microsoft.com/rest/api/searchmanagement/services) are alternative solutions.

 ## Terminology: replicas and partitions

@@ -24,16 +26,18 @@ Capacity is a function of the [tier you choose](search-sku-tier.md) (tiers deter
 |*Partitions* | Provides index storage and I/O for read/write operations (for example, when rebuilding or refreshing an index). Each partition has a share of the total index. If you allocate three partitions, your index is divided into thirds. |
 |*Replicas* | Instances of the search service, used primarily to load balance query operations. Each replica is one copy of an index. If you allocate three replicas, you'll have three copies of an index available for servicing query requests.|

-## How to allocate replicas and partitions
+## When to add nodes

 Initially, a service is allocated a minimal level of resources consisting of one partition and one replica.

-A single service must have sufficient resources to handle all workloads (indexing and queries). Neither workload runs in the background. You can schedule indexing for times when query requests are naturally less frequent, but the service will not otherwise prioritize one task over another.
-
-When modifying the allocation of replicas and partitions, we recommend using the Azure portal. The portal enforces limits on allowable combinations that stay below maximum limits of a tier. However, if you require a script-based or code-based provisioning approach, the [Azure PowerShell](search-manage-powershell.md) or the [Management REST API](https://docs.microsoft.com/rest/api/searchmanagement/services) are alternative solutions.
+A single service must have sufficient resources to handle all workloads (indexing and queries). Neither workload runs in the background. You can schedule indexing for times when query requests are naturally less frequent, but the service will not otherwise prioritize one task over another. Additionally, a certain amount of redundancy smooths out query performance when services or nodes are updated internally.

 As a general rule, search applications tend to need more replicas than partitions, particularly when the service operations are biased toward query workloads. The section on [high availability](#HA) explains why.

+Adding more replicas or partitions increases your cost of running the service. Be sure to check the [pricing calculator](https://azure.microsoft.com/pricing/calculator/) to understand the billing implications of adding more nodes. The [chart below](#chart) can help you cross-reference the number of search units required for a specific configuration.
+
+## How to allocate replicas and partitions
+
 1. Sign in to the [Azure portal](https://portal.azure.com/) and select the search service.

 1. In **Settings**, open the **Scale** page to modify replicas and partitions.
@@ -110,15 +114,15 @@ Currently, there is no built-in mechanism for disaster recovery. Adding partitio

 ## Estimate replicas

-On a production service, you should allocate three replicas for SLA purposes. If you experience slow query performance, one remedy is to add replicas so that additional copies of the index are brought online to support bigger query workloads and to load balance the requests over the multiple replicas.
+On a production service, you should allocate three replicas for SLA purposes. If you experience slow query performance, you can add replicas so that additional copies of the index are brought online to support bigger query workloads and to load balance the requests over the multiple replicas.

 We do not provide guidelines on how many replicas are needed to accommodate query loads. Query performance depends on the complexity of the query and competing workloads. Although adding replicas clearly results in better performance, the result is not strictly linear: adding three replicas does not guarantee triple throughput.

 For guidance in estimating QPS for your solution, see [Scale for performance](search-performance-optimization.md) and [Monitor queries](search-monitor-queries.md).
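
If diagnostic logging is enabled for the service (see the search-monitor-logs.md updates in this same commit), one way to ground a replica estimate is to look at observed query volume. The following Kusto query is an illustrative sketch only, not part of this change; it assumes the metric is emitted to the AzureMetrics table under the name SearchQueriesPerSecond:

```
AzureMetrics
| where MetricName == "SearchQueriesPerSecond"
| summarize AvgQPS = avg(Average), PeakQPS = max(Maximum) by bin(TimeGenerated, 1h)
| order by TimeGenerated asc
```

Comparing peak hourly QPS against the average gives a sense of how much headroom additional replicas would need to cover.
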

 ## Estimate partitions

-The [tier you choose](search-sku-tier.md) determines partition size and speed, and each tier is optimized around a set of characteristics that fit various scenarios. If you choose a higher-end tier, you might need fewer partitions than if you go with S1.
+The [tier you choose](search-sku-tier.md) determines partition size and speed, and each tier is optimized around a set of characteristics that fit various scenarios. If you choose a higher-end tier, you might need fewer partitions than if you go with S1. One of the questions you'll need to answer through self-directed testing is whether a larger and more expensive partition yields better performance than two cheaper partitions on a service provisioned at a lower tier.

 Search applications that require near real-time data refresh will need proportionally more partitions than replicas. Adding partitions spreads read/write operations across a larger number of compute resources. It also gives you more disk space for storing additional indexes and documents.

articles/search/search-monitor-logs.md

Lines changed: 41 additions & 20 deletions
@@ -8,22 +8,22 @@ author: HeidiSteen
 ms.author: heidist
 ms.service: cognitive-search
 ms.topic: conceptual
-ms.date: 02/11/2020
+ms.date: 02/18/2020
 ---

 # Collect and analyze log data for Azure Cognitive Search

-Diagnostic or operational logs provide insight into the detailed operations of Azure Cognitive Search and are useful for monitoring service and workload processes. Internally, logs exist on the backend for a short period of time, sufficient for investigation and analysis if you file a support ticket. However, if you want self-direction over operational data, you should configure a diagnostic setting to specify where logging information is collected.
+Diagnostic or operational logs provide insight into the detailed operations of Azure Cognitive Search and are useful for monitoring service and workload processes. Internally, logs exist on the backend for a short period of time, sufficient for investigation and analysis if you file a support ticket. However, if you want self-direction over operational data, you should configure a diagnostic setting to specify where logging information is collected.

 Setting up logs is useful for diagnostics and preserving operational history. After you enable logging, you can run queries or build reports for structured analysis.

 The following table enumerates options for collecting and persisting data.

 | Resource | Used for |
 |----------|----------|
-| [Send to Log Analytics workspace](https://docs.microsoft.com/azure/azure-monitor/learn/tutorial-resource-logs) | Logged events and query metrics, based on the schemas below. Events are logged to a Log Analytics workspace. Using Log Analytics, you can run queries to return detailed information. For more information, see [Get started with Azure Monitor logs](https://docs.microsoft.com/azure/azure-monitor/learn/tutorial-viewdata) |
-| [Archive with Blob storage](https://docs.microsoft.com/azure/storage/blobs/storage-blobs-overview) | Logged events and query metrics, based on the schemas below. Events are logged to a Blob container and stored in JSON files. Logs can be quite granular (by the hour/minute), useful for researching a specific incident but not for open-ended investigation. Use a JSON editor to view a log file.|
-| [Stream to Event Hub](https://docs.microsoft.com/azure/event-hubs/) | Logged events and query metrics, based on the schemas documented in this article. Choose this as an alternative data collection service for very large logs. |
+| [Send to Log Analytics workspace](https://docs.microsoft.com/azure/azure-monitor/learn/tutorial-resource-logs) | Events and metrics are sent to a Log Analytics workspace, which can be queried in the portal to return detailed information. For an introduction, see [Get started with Azure Monitor logs](https://docs.microsoft.com/azure/azure-monitor/learn/tutorial-viewdata) |
+| [Archive with Blob storage](https://docs.microsoft.com/azure/storage/blobs/storage-blobs-overview) | Events and metrics are archived to a Blob container and stored in JSON files. Logs can be quite granular (by the hour/minute), useful for researching a specific incident but not for open-ended investigation. Use a JSON editor to view a raw log file or Power BI to aggregate and visualize log data.|
+| [Stream to Event Hub](https://docs.microsoft.com/azure/event-hubs/) | Events and metrics are streamed to an Azure Event Hubs service. Choose this as an alternative data collection service for very large logs. |

 Both Azure Monitor logs and Blob storage are available as a free service so that you can try it out at no charge for the lifetime of your Azure subscription. Application Insights is free to sign up and use as long as application data size is under certain limits (see the [pricing page](https://azure.microsoft.com/pricing/details/monitor/) for details).

@@ -33,38 +33,59 @@ If you are using Log Analytics or Azure Storage, you can create resources in adv

 + [Create a log analytics workspace](https://docs.microsoft.com/azure/azure-monitor/learn/quick-create-workspace)

-+ [Create a storage account](https://docs.microsoft.com/azure/storage/common/storage-quickstart-create-account) if you require a log archive.
++ [Create a storage account](https://docs.microsoft.com/azure/storage/common/storage-quickstart-create-account)

-## Create a log
+## Enable data collection

-Diagnostic settings define data collection. A setting specifies how and what is collected.
+Diagnostic settings specify how logged events and metrics are collected.

 1. Under **Monitoring**, select **Diagnostic settings**.

 ![Diagnostic settings](./media/search-monitor-usage/diagnostic-settings.png "Diagnostic settings")

 1. Select **+ Add diagnostic setting**

-1. Choose the data you want to export: Logs, Metrics or both. You can collect data in a storage account, a log analytics workspace, or stream it to Event Hub.
-
-Log analytics is recommended because you can query the workspace in the portal.
-
-If you are also using Blob storage, containers and blobs will be created as-needed when log data is exported.
+1. Check **Log Analytics**, select your workspace, and select **OperationLogs** and **AllMetrics**.

 ![Configure data collection](./media/search-monitor-usage/configure-storage.png "Configure data collection")

 1. Save the setting.

-1. Test by creating or deleting objects (creates log events) and by submitting queries (generates metrics).
+1. After logging has been enabled, use your search service to start generating logs and metrics. It will take time before logged events and metrics become available.
+
+For Log Analytics, it will be several minutes before data is available, after which you can run Kusto queries to return data. For more information, see [Monitor query requests](search-monitor-logs.md).
+
+For Blob storage, it takes one hour before the containers will appear in Blob storage. There is one blob, per hour, per container. Containers are only created when there is an activity to log or measure. When the data is copied to a storage account, the data is formatted as JSON and placed in two containers:
+
++ insights-logs-operationlogs: for search traffic logs
++ insights-metrics-pt1m: for metrics
+
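
As a quick check that collection is working, you can count recent operations in the workspace once events begin to arrive. This query is illustrative only (not part of this change) and assumes the AzureDiagnostics table described in the next section:

```
AzureDiagnostics
| where TimeGenerated > ago(1h)
| summarize Operations = count() by OperationName
```
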
+## Query log information
+
+In diagnostic logs, two tables contain logs and metrics for Azure Cognitive Search: **AzureDiagnostics** and **AzureMetrics**.
+
+1. Under **Monitoring**, select **Logs**.
+
+1. Enter **AzureMetrics** in the query window. Run this simple query to get acquainted with the data collected in this table. Scroll across the table to view metrics and values. Notice the record count at the top, and if your service has been collecting metrics for a while, you might want to adjust the time interval to get a manageable data set.
+
+![AzureMetrics table](./media/search-monitor-usage/azuremetrics-table.png "AzureMetrics table")
+
+1. Enter the following query to return a tabular result set.

-In Blob storage, containers are only created when there is an activity to log or measure. When the data is copied to a storage account, the data is formatted as JSON and placed in two containers:
+```
+AzureMetrics
+| project MetricName, Total, Count, Maximum, Minimum, Average
+```

-* insights-logs-operationlogs: for search traffic logs
-* insights-metrics-pt1m: for metrics
+1. Repeat the previous steps, starting with **AzureDiagnostics** to return all columns for informational purposes, followed by a more selective query that extracts more interesting information.

-**It takes one hour before the containers will appear in Blob storage. There is one blob, per hour, per container.**
+```
+AzureDiagnostics
+| project OperationName, resultSignature_d, DurationMs, Query_s, Documents_d, IndexName_s
+| where OperationName == "Query.Search"
+```

-Logs are archived for every hour in which activity occurs. The following path is an example of one log file created on January 12 2020 at 9:00 a.m. where each `/` is a folder: `resourceId=/subscriptions/<subscriptionID>/resourcegroups/<resourceGroupName>/providers/microsoft.search/searchservices/<searchServiceName>/y=2020/m=01/d=12/h=09/m=00/name=PT1H.json`
+![AzureDiagnostics table](./media/search-monitor-usage/azurediagnostics-table.png "AzureDiagnostics table")
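
Building on the columns projected above, an illustrative follow-on query (not part of this change) aggregates query volume and average latency by hour, which helps correlate slow queries with traffic spikes:

```
AzureDiagnostics
| where OperationName == "Query.Search"
| summarize QueryCount = count(), AvgDurationMs = avg(DurationMs) by bin(TimeGenerated, 1h)
| order by TimeGenerated asc
```
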

 ## Log schema

@@ -119,7 +140,7 @@ For the **Search Queries Per Second** metric, minimum is the lowest value for se

 For **Throttled Search Queries Percentage**, minimum, maximum, average and total, all have the same value: the percentage of search queries that were throttled, from the total number of search queries during one minute.
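
For example, an illustrative query (not part of this change) over the AzureMetrics table surfaces throttling by hour; the metric name shown is assumed to match the per-minute metric described above:

```
AzureMetrics
| where MetricName == "ThrottledSearchQueriesPercentage"
| summarize MaxThrottledPct = max(Maximum), AvgThrottledPct = avg(Average) by bin(TimeGenerated, 1h)
| order by TimeGenerated asc
```
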

-## View log files
+## View raw log files

 Blob storage is used for archiving log files. You can use any JSON editor to view the log file. If you don't have one, we recommend [Visual Studio Code](https://code.visualstudio.com/download).
