articles/search/search-indexer-securing-resources.md
Lines changed: 60 additions & 71 deletions
@@ -15,13 +15,13 @@ ms.date: 05/01/2024
15
15
16
16
# Indexer access to content protected by Azure network security
17
17
18
-
If your search solution requirements include an Azure virtual network, this concept article explains how a search indexer can access content that's protected by network security. It describes the outbound traffic patterns and indexer execution environments. It also covers the network protections supported by Azure AI Search and factors that might influence your security strategy. Finally, because Azure Storage is used for both data access and persistent storage, this article also covers network considerations that are specific to search and storage connectivity.
18
+
If your Azure resources are deployed in an Azure virtual network, this concept article explains how a search indexer can access content that's protected by network security. It describes the outbound traffic patterns and indexer execution environments. It also covers the network protections supported by Azure AI Search and factors that might influence your security strategy. Finally, because Azure Storage is used for both data access and persistent storage, this article also covers network considerations that are specific to [search and storage connectivity](#access-to-a-network-protected-storage-account).
19
19
20
20
Looking for step-by-step instructions instead? See [How to configure firewall rules to allow indexer access](search-indexer-howto-access-ip-restricted.md) or [How to make outbound connections through a private endpoint](search-indexer-howto-access-private.md).
21
21
22
22
## Resources accessed by indexers
23
23
24
-
Azure AI Search indexers can make outbound calls to various Azure resources during execution. An indexer makes outbound calls in three situations:
24
+
Azure AI Search indexers can make outbound calls to various Azure resources in three situations:
25
25
26
26
- Connections to external data sources during indexing
27
27
- Connections to external, encapsulated code through a skillset that includes custom skills
@@ -42,6 +42,15 @@ A list of all possible Azure resource types that an indexer might access in a ty
42
42
> [!NOTE]
43
43
> An indexer also connects to Azure AI services for built-in skills. However, that connection is made over the internal network and isn't subject to any network provisions under your control.
44
44
45
+
Indexers connect to resources using the following approaches:
46
+
47
+
- A public endpoint with credentials
48
+
- A private endpoint, using Azure Private Link
49
+
- A trusted service connection
50
+
- IP addressing
51
+
52
+
If your Azure resource is on a virtual network, you should use either a private endpoint or IP addressing to admit indexer connections to the data.
53
+
45
54
## Supported network protections
46
55
47
56
Your Azure resources could be protected using any number of the network isolation mechanisms offered by Azure. Depending on the resource and region, Azure AI Search indexers can make outbound connections through IP firewalls and private endpoints, subject to the limitations indicated in the following table.
@@ -60,29 +69,28 @@ Your Azure resources could be protected using any number of the network isolatio
60
69
61
70
## Indexer execution environment
62
71
63
-
Azure AI Search has the concept of an *indexer execution environment* that optimizes processing based on the characteristics of the job. There are two environments. If you're using an IP firewall to control access to Azure resources, knowing about execution environments will help you set up an IP range that is inclusive of both.
64
-
65
-
For any given indexer run, Azure AI Search determines the best environment in which to run the indexer. Depending on the number and types of tasks assigned, the indexer will run in one of two environments:
66
-
67
-
- A *private execution environment* that's internal to a search service.
72
+
Azure AI Search has the concept of an *indexer execution environment* that optimizes processing based on the characteristics of the job. There are two environments. If you're using an IP firewall to control access to Azure resources, knowing about execution environments will help you set up an IP range that is inclusive of both environments.
68
73
69
-
Indexers running in the private environment share computing resources with other indexing and query workloads on the same search service. Typically, only indexers that perform text-based indexing (without skillsets) run in this environment.
70
-
71
-
- A *multitenant environment* that's managed and secured by Microsoft at no extra cost. It isn't subject to any network provisions under your control.
72
-
73
-
This environment is used to offload computationally intensive processing, leaving service-specific resources available for routine operations. Examples of resource-intensive indexer jobs include attaching skillsets, processing large documents, or processing a high volume of documents.
74
+
For any given indexer run, Azure AI Search determines the best environment in which to run the indexer. Depending on the number and types of tasks assigned, the indexer will run in one of two environments.
74
75
76
+
| Execution environment | Description |
77
+
|-----------------------|-------------|
78
+
| Private | Internal to a search service. Indexers running in the private environment share computing resources with other indexing and query workloads on the same search service. Typically, only indexers that perform text-based indexing (without skillsets) run in this environment. If you set up a private connection between an indexer and your data, this is the only execution environment you can use. |
79
+
| Multitenant | Managed and secured by Microsoft at no extra cost. It isn't subject to any network provisions under your control. This environment is used to offload computationally intensive processing, leaving service-specific resources available for routine operations. Examples of resource-intensive indexer jobs include attaching skillsets, processing large documents, or processing a high volume of documents. |
80
+
75
81
The following section explains the IP configuration for admitting requests from either execution environment.
76
82
77
83
### Setting up IP ranges for indexer execution
78
84
79
-
If the Azure resource that provides source data exists behind a firewall, you need [inbound rules that admit indexer connections](search-indexer-howto-access-ip-restricted.md) for all of the IPs from which an indexer request can originate. The IPs include the one used by the search service and the multitenant environment.
85
+
If your Azure resource is behind a firewall, set up [inbound rules that admit indexer connections](search-indexer-howto-access-ip-restricted.md) for all of the IPs from which an indexer request can originate. This includes the IP address used by the search service, and the IP addresses used by the multitenant environment.
80
86
81
87
- To obtain the IP address of the search service (and the private execution environment), run `nslookup` (or `ping`) against the fully qualified domain name (FQDN) of your search service. The FQDN of a search service in the public cloud would be `<service-name>.search.windows.net`.
82
88
83
89
- To obtain the IP addresses of the multitenant environments within which an indexer might run, use the `AzureCognitiveSearch` service tag.
84
90
85
-
[Azure service tags](../virtual-network/service-tags-overview.md) have a published range of IP addresses for each service. You can find these IPs using the [discovery API](../virtual-network/service-tags-overview.md#use-the-service-tag-discovery-api) or a [downloadable JSON file](../virtual-network/service-tags-overview.md#discover-service-tags-by-using-downloadable-json-files). IP ranges are allocated by region, so check your search service region before you start.
91
+
[Azure service tags](../virtual-network/service-tags-overview.md) have a published range of IP addresses of the multitenant environments for each region. You can find these IPs using the [discovery API](../virtual-network/service-tags-overview.md#use-the-service-tag-discovery-api) or a [downloadable JSON file](../virtual-network/service-tags-overview.md#discover-service-tags-by-using-downloadable-json-files). IP ranges are allocated by region, so check your search service region before you start.
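As a rough illustration of the two lookups above, the following Python sketch builds the search service FQDN and filters a service tag document for the regional `AzureCognitiveSearch` entry. It's a minimal sketch under stated assumptions: the function names are hypothetical, the `values`, `name`, and `properties.addressPrefixes` fields are assumed to match the layout of the downloadable JSON file, and the sample addresses are made-up documentation ranges, not real Azure IPs.

```python
import ipaddress
import socket


def search_service_fqdn(service_name: str) -> str:
    """Build the public-cloud FQDN of a search service."""
    return f"{service_name}.search.windows.net"


def resolve_search_service_ip(service_name: str) -> str:
    """Resolve the FQDN to an IP address (a network call, similar to nslookup)."""
    return socket.gethostbyname(search_service_fqdn(service_name))


def multitenant_prefixes(service_tags: dict, region: str) -> list[str]:
    """Return the IPv4 prefixes listed under AzureCognitiveSearch.<region>.

    Assumes the document has a top-level "values" list whose entries carry
    "name" and "properties.addressPrefixes", as in the downloadable file.
    """
    wanted = f"AzureCognitiveSearch.{region}".lower()
    for entry in service_tags.get("values", []):
        if entry.get("name", "").lower() == wanted:
            prefixes = entry.get("properties", {}).get("addressPrefixes", [])
            # Keep IPv4 ranges only; drop any IPv6 prefixes in the list.
            return [
                p for p in prefixes
                if ipaddress.ip_network(p, strict=False).version == 4
            ]
    return []


# Made-up sample shaped like the assumed file layout (documentation IP ranges).
SAMPLE_TAGS = {
    "values": [
        {
            "name": "AzureCognitiveSearch.WestUS2",
            "properties": {
                "region": "westus2",
                "addressPrefixes": [
                    "203.0.113.0/28",
                    "2001:db8::/32",
                    "198.51.100.4/32",
                ],
            },
        },
        {
            "name": "AzureCognitiveSearch.EastUS",
            "properties": {
                "region": "eastus",
                "addressPrefixes": ["192.0.2.0/24"],
            },
        },
    ]
}

westus2_ranges = multitenant_prefixes(SAMPLE_TAGS, "WestUS2")
```

In practice, you would load the real downloaded file with `json.load` and combine the returned ranges with the address from `resolve_search_service_ip` when writing the inbound rules.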
92
+
93
+
#### Setting up IP rules for Azure SQL
86
94
87
95
When setting the IP rule for the multitenant environment, certain SQL data sources support a simple approach for IP address specification. Instead of enumerating all of the IP addresses in the rule, you can create a [Network Security Group rule](../virtual-network/network-security-groups-overview.md) that specifies the `AzureCognitiveSearch` service tag.
88
96
@@ -94,84 +102,65 @@ You can specify the service tag if your data source is either:
94
102
95
103
Notice that if you specified the service tag for the multitenant environment IP rule, you'll still need an explicit inbound rule for the private execution environment (meaning the search service itself), as obtained through `nslookup`.
96
104
97
-
## Supplement network security with authorization
98
-
99
-
Firewalls and network security are a first step in preventing unauthorized access to data and operations. Authorization should be your next step.
100
-
101
-
We recommend role-based access, where Microsoft Entra ID users and groups are assigned to roles that determine read and write access to your service. See [Connect to Azure AI Search using role-based access controls](search-security-rbac.md) for a description of built-in roles and instructions for creating custom roles.
102
-
103
-
If you don't need key-based authentication, we recommend that you disable API keys and use role assignments exclusively.
104
-
105
-
## Choosing a connectivity approach
106
-
107
-
When integrating Azure AI Search into a solution that runs on a virtual network, consider the following constraints:
108
-
109
-
- An indexer can't make a direct connection to a [virtual network service endpoint](../virtual-network/virtual-network-service-endpoints-overview.md). Public endpoints with credentials, private endpoints, trusted service, and IP addressing are the only supported methodologies for indexer connections.
105
+
## Choose a connectivity approach
110
106
111
-
- A search service always runs in the cloud and can't be provisioned into a specific virtual network, running natively on a virtual machine. This functionality won't be offered by Azure AI Search.
107
+
A search service can't be provisioned into a specific virtual network or run natively on a virtual machine. Although some Azure resources offer [virtual network service endpoints](/azure/virtual-network/virtual-network-service-endpoints-overview), Azure AI Search won't offer this functionality. You should plan on implementing one of the following approaches.
112
108
113
-
Given the above constraints, your choices for achieving search integration in a virtual network are:
109
+
| Approach | Details |
110
+
|----------|---------|
111
+
| Inbound connection to your Azure resource | Configure an inbound firewall rule on your Azure resource that admits indexer requests for your data. Your firewall configuration should include the service tag for multitenant execution and the IP address of your search service. |
112
+
| Private connection between Azure AI Search and your Azure resource | Configure a shared private link used exclusively by your search service for connections to your resource. Connections travel over the internal network and bypass the public internet. If your resources are fully locked down (running on a protected virtual network, or otherwise not available over a public connection), a private endpoint is your only choice. See [Make outbound connections through a private endpoint](search-indexer-howto-access-private.md).|
114
113
115
-
- Configure an inbound firewall rule on your Azure PaaS resource that admits indexer requests for data. Follow that up with role assignments that specify which users and groups have read and write access to your data and operations.
114
+
Connections through a private endpoint must originate from the search service's private execution environment.
116
115
117
-
- Configure an outbound connection from Search that makes indexer connections using a [private endpoint](../private-link/private-endpoint-overview.md).
116
+
Configuring an IP firewall is free. A private endpoint, which is based on Azure Private Link, has a billing impact. See [Azure Private Link pricing](https://azure.microsoft.com/pricing/details/private-link/) for details.
118
117
119
-
For a private endpoint, the search service connection to your protected resource is through a *shared private link*. A shared private link is an [Azure Private Link](../private-link/private-link-overview.md) resource that's created, managed, and used from within Azure AI Search. If your resources are fully locked down (running on a protected virtual network, or otherwise not available over a public connection), a private endpoint is your only choice.
118
+
After you configure network security, follow up with role assignments that specify which users and groups have read and write access to your data and operations.
120
119
121
-
Connections through a private endpoint must originate from the search service's private execution environment. To meet this requirement, you'll have to disable multitenant execution. This step is described in [Make outbound connections through a private endpoint](search-indexer-howto-access-private.md).
120
+
### Considerations for using a private endpoint
122
121
123
-
Configuring an IP firewall is free. A private endpoint, which is based on Azure Private Link, has a billing impact.
122
+
This section narrows in on the private connection option.
124
123
125
-
### Working with a private endpoint
124
+
- A shared private link requires a billable search service, where the minimum tier is either Basic for text-based indexing or Standard 2 (S2) for skills-based indexing. See [tier limits on the number of private endpoints](search-limits-quotas-capacity.md#shared-private-link-resource-limits) for details.
126
125
127
-
This section summarizes the main steps for setting up a private endpoint for outbound indexer connections. This summary might help you decide whether a private endpoint is the best choice for your scenario. Detailed steps are covered in [How to make outbound connections through a private endpoint](search-indexer-howto-access-private.md).
126
+
- Once a shared private link is created, the search service always uses it for every indexer connection to that specific Azure resource. The private connection is locked and enforced internally. You can't bypass the private connection for a public connection.
128
127
129
-
#### Billing impact of Azure Private Link
128
+
- Requires a billable Azure Private Link resource.
130
129
131
-
- A shared private link requires a billable search service, where the minimum tier is either Basic for text-based indexing or Standard 2 (S2) for skills-based indexing. See [tier limits on the number of private endpoints](search-limits-quotas-capacity.md#shared-private-link-resource-limits) for details.
130
+
- Requires that a subscription owner approve the private endpoint connection.
132
131
133
-
- Inbound and outbound connections are subject to [Azure Private Link pricing](https://azure.microsoft.com/pricing/details/private-link/).
132
+
- Requires that you turn off the multitenant execution environment for the indexer.
134
133
135
-
#### Step 1: Create a private endpoint to the secure resource
136
-
137
-
You'll create a shared private link using either the portal pages of your search service or through the [Management API](/rest/api/searchmanagement/shared-private-link-resources/create-or-update).
138
-
139
-
In Azure AI Search, your search service must be at least the Basic tier for text-based indexers, and S2 for indexers with skillsets.
140
-
141
-
A private endpoint connection will accept requests from the private indexer execution environment, but not the multitenant environment. You'll need to disable multitenant execution as described in step 3 to meet this requirement.
142
-
143
-
#### Step 2: Approve the private endpoint connection
144
-
145
-
When the (asynchronous) operation that creates a shared private link resource completes, a private endpoint connection will be created in a "Pending" state. No traffic flows over the connection yet.
146
-
147
-
You'll need to locate and approve this request on your secure resource. Depending on the resource, you can complete this task using Azure portal. Otherwise, use the [Private Link Service REST API](/rest/api/virtualnetwork/privatelinkservices/updateprivateendpointconnection).
148
-
149
-
#### Step 3: Force indexers to run in the "private" environment
150
-
151
-
For private endpoint connections, it's mandatory to set the `executionEnvironment` of the indexer to `"Private"`. This step ensures that all indexer execution is confined to the private environment provisioned within the search service.
152
-
153
-
This setting is scoped to an indexer and not the search service. If you want all indexers to connect over private endpoints, each one must have the following configuration:
154
-
155
-
```json
156
-
{
157
-
"name" : "myindexer",
158
-
... other indexer properties
159
-
"parameters" : {
160
-
... other parameters
161
-
"configuration" : {
162
-
... other configuration properties
163
-
"executionEnvironment": "Private"
134
+
You do this by setting the `executionEnvironment` of the indexer to `"Private"`. This step ensures that all indexer execution is confined to the private environment provisioned within the search service. This setting is scoped to an indexer and not the search service. If you want all indexers to connect over private endpoints, each one must have the following configuration:
135
+
136
+
```json
137
+
{
138
+
"name" : "myindexer",
139
+
... other indexer properties
140
+
"parameters" : {
141
+
... other parameters
142
+
"configuration" : {
143
+
... other configuration properties
144
+
"executionEnvironment": "Private"
145
+
}
164
146
}
165
-
}
166
-
}
167
-
```
147
+
}
148
+
```
168
149
169
150
Once you have an approved private endpoint to a resource, indexers that are set to be *private* attempt to obtain access via the private link that was created and approved for the Azure resource.
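Because the setting is scoped to each indexer, applying it across several indexer definitions can be scripted. The following Python sketch shows one way to add the `executionEnvironment` property to an indexer definition held as a dictionary. The helper name `with_private_execution` and the sample definition are hypothetical; sending the updated definition back to the service is left out.

```python
import copy


def with_private_execution(indexer: dict) -> dict:
    """Return a copy of an indexer definition pinned to the private environment.

    Existing parameters and configuration properties are preserved; only
    parameters.configuration.executionEnvironment is set.
    """
    updated = copy.deepcopy(indexer)
    # Tolerate definitions where "parameters" or "configuration" is missing or null.
    parameters = updated.get("parameters") or {}
    configuration = parameters.get("configuration") or {}
    configuration["executionEnvironment"] = "Private"
    parameters["configuration"] = configuration
    updated["parameters"] = parameters
    return updated


# Hypothetical indexer definition used only for illustration.
original = {
    "name": "myindexer",
    "parameters": {"batchSize": 10, "configuration": {"parsingMode": "default"}},
}
private_indexer = with_private_execution(original)
```

The helper copies the definition instead of modifying it in place, so the original can still be compared against the updated version before you submit it.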
170
151
171
152
Azure AI Search will validate that callers of the private endpoint have appropriate role assignments. For example, if you request a private endpoint connection to a storage account with read-only permissions, this call will be rejected.
172
153
173
154
If the private endpoint isn't approved, or if the indexer didn't use the private endpoint connection, you'll find a `transientFailure` error message in indexer execution history.
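To spot this condition programmatically, you could scan the indexer status payload for runs that ended in `transientFailure`. The following Python sketch is a minimal illustration that assumes the payload has an `executionHistory` list whose entries carry a `status` field. The sample payload, its error message, and the function name are made up.

```python
def transient_failures(indexer_status: dict) -> list[dict]:
    """Return execution history entries whose status is transientFailure."""
    history = indexer_status.get("executionHistory") or []
    return [run for run in history if run.get("status") == "transientFailure"]


# Made-up payload shaped like the assumed indexer status response.
SAMPLE_STATUS = {
    "status": "running",
    "executionHistory": [
        {"status": "success", "startTime": "2024-05-01T10:00:00Z", "errorMessage": None},
        {
            "status": "transientFailure",
            "startTime": "2024-05-01T11:00:00Z",
            "errorMessage": "placeholder error text",
        },
    ],
}

failed_runs = transient_failures(SAMPLE_STATUS)
```

A nonempty result is a prompt to check whether the private endpoint connection was approved and whether the indexer is set to run in the private environment.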
174
155
156
+
## Supplement network security with token authentication
157
+
158
+
Firewalls and network security are a first step in preventing unauthorized access to data and operations. Authorization should be your next step.
159
+
160
+
We recommend role-based access, where Microsoft Entra ID users and groups are assigned to roles that determine read and write access to your service. See [Connect to Azure AI Search using role-based access controls](search-security-rbac.md) for a description of built-in roles and instructions for creating custom roles.
161
+
162
+
If you don't need key-based authentication, we recommend that you disable API keys and use role assignments exclusively.
163
+
175
164
## Access to a network-protected storage account
176
165
177
166
A search service stores indexes and synonym lists. For other features that require storage, Azure AI Search takes a dependency on Azure Storage. Enrichment caching, debug sessions, and knowledge stores fall into this category. The location of each service, and any network protections in place for storage, will determine your data access strategy.