Commit 09e68ea

Merge pull request #34479 from MicrosoftDocs/main
Merged by Learn.Build PR Management system
2 parents e11a3e4 + a44de91 commit 09e68ea

3 files changed (+238, -204 lines changed)

docs/relational-databases/json/index-json-data.md

Lines changed: 138 additions & 119 deletions
@@ -1,196 +1,215 @@
11
---
2-
title: "Index JSON data"
2+
title: "Index JSON Data"
33
description: "Index JSON data"
44
author: WilliamDAssafMSFT
55
ms.author: wiassaf
6-
ms.reviewer: jroth, jovanpop
7-
ms.date: 08/20/2024
6+
ms.reviewer: jroth, jovanpop, randolphwest
7+
ms.date: 06/19/2025
88
ms.service: sql
9+
ms.topic: how-to
910
ms.custom:
1011
- build-2024
11-
ms.topic: how-to
1212
helpviewer_keywords:
1313
- "JSON, indexing JSON data"
1414
- "indexing JSON data"
15-
monikerRange: "=azuresqldb-current||>=sql-server-2016||>=sql-server-linux-2017||=azuresqldb-mi-current"
15+
monikerRange: "=azuresqldb-current || >=sql-server-2016 || >=sql-server-linux-2017 || =azuresqldb-mi-current"
1616
---
1717
# Index JSON data
18+
1819
[!INCLUDE [SQL Server Azure SQL Database Azure SQL Managed Instance](../../includes/applies-to-version/sqlserver2016-asdb-asdbmi.md)]
1920

20-
You can optimize your queries over JSON documents using standard indexes. SQL Server does not have custom JSON indexes.
21+
You can optimize your queries over JSON documents using standard indexes.
2122

22-
- Currently, in SQL Server **json** is not a built-in data type.
23-
- The [JSON data type](../../t-sql/data-types/json-data-type.md) is currently in preview for Azure SQL Database and Azure SQL Managed Instance (configured with the [**Always-up-to-date** update policy](/azure/azure-sql/managed-instance/update-policy#always-up-to-date-update-policy)).
23+
The [JSON data type](../../t-sql/data-types/json-data-type.md):
24+
25+
- is generally available for Azure SQL Database and Azure SQL Managed Instance configured with the **[Always-up-to-date update policy](/azure/azure-sql/managed-instance/update-policy#always-up-to-date-update-policy)**.
26+
- is in preview for [!INCLUDE [sssql25-md](../../includes/sssql25-md.md)].
27+
28+
> [!NOTE]
29+
> In [!INCLUDE [sssql25-md](../../includes/sssql25-md.md)], you can use the [CREATE JSON INDEX](../../t-sql/statements/create-json-index-transact-sql.md) feature.
2430
2531
Indexes work the same way on JSON data in **varchar**/**nvarchar** or the [native **json** data type](../../t-sql/data-types/json-data-type.md).
2632

27-
Database indexes improve the performance of filter and sort operations. Without indexes, SQL Server has to perform a full table scan every time you query data.
28-
33+
Database indexes improve the performance of filter and sort operations. Without indexes, SQL Server has to perform a full table scan every time you query data.
34+
2935
## Index JSON properties by using computed columns
30-
When you store JSON data in SQL Server, typically you want to filter or sort query results by one or more *properties* of the JSON documents.
36+
37+
When you store JSON data in SQL Server, typically you want to filter or sort query results by one or more *properties* of the JSON documents.
3138

3239
### Example
33-
In this example, assume that the `AdventureWorks.SalesOrderHeader` table has an `Info` column that contains various information in JSON format about sales orders. For example, it contains unstructured data about customer, sales person, shipping and billing addresses, and so forth. You could use values from the `Info` column to filter sales orders for a customer.
3440

35-
By default, the column `Info` used does not exist, it can be created in the `AdventureWorks` database with the following code. The following examples do not apply to the `AdventureWorksLT` series of sample databases.
41+
In this example, assume that the `AdventureWorks.SalesOrderHeader` table has an `Info` column that contains various information in JSON format about sales orders. For example, it contains unstructured data about customer, sales person, shipping and billing addresses, and so forth. You could use values from the `Info` column to filter sales orders for a customer.
42+
43+
By default, the column `Info` used doesn't exist, it can be created in the `AdventureWorks` database with the following code. The following examples don't apply to the `AdventureWorksLT` series of sample databases.
3644

37-
```sql
38-
IF NOT EXISTS(SELECT * FROM sys.columns WHERE object_id = OBJECT_ID('[Sales].[SalesOrderHeader]') AND name = 'Info')
39-
ALTER TABLE [Sales].[SalesOrderHeader] ADD [Info] NVARCHAR(MAX) NULL
45+
```sql
46+
IF NOT EXISTS (SELECT *
47+
FROM sys.columns
48+
WHERE object_id = OBJECT_ID('[Sales].[SalesOrderHeader]')
49+
AND name = 'Info')
50+
ALTER TABLE [Sales].[SalesOrderHeader]
51+
ADD [Info] NVARCHAR (MAX) NULL;
4052
GO
41-
UPDATE h
53+
54+
UPDATE h
4255
SET [Info] =
4356
(
44-
SELECT [Customer.Name] = concat(p.FirstName, N' ', p.LastName),
45-
[Customer.ID] = p.BusinessEntityID,
46-
[Customer.Type] = p.[PersonType],
47-
[Order.ID] = soh.SalesOrderID,
48-
[Order.Number] = soh.SalesOrderNumber,
57+
SELECT [Customer.Name] = concat(p.FirstName, N' ', p.LastName),
58+
[Customer.ID] = p.BusinessEntityID,
59+
[Customer.Type] = p.[PersonType],
60+
[Order.ID] = soh.SalesOrderID,
61+
[Order.Number] = soh.SalesOrderNumber,
4962
[Order.CreationData] = soh.OrderDate,
5063
[Order.TotalDue] = soh.TotalDue
5164
FROM [Sales].SalesOrderHeader AS soh
52-
INNER JOIN [Sales].[Customer] AS c ON c.CustomerID = soh.CustomerID
53-
INNER JOIN [Person].[Person] AS p ON p.BusinessEntityID = c.CustomerID
54-
WHERE soh.SalesOrderID = h.SalesOrderID FOR JSON PATH, WITHOUT_ARRAY_WRAPPER
65+
INNER JOIN [Sales].[Customer] AS c
66+
ON c.CustomerID = soh.CustomerID
67+
INNER JOIN [Person].[Person] AS p
68+
ON p.BusinessEntityID = c.CustomerID
69+
WHERE soh.SalesOrderID = h.SalesOrderID
70+
FOR JSON PATH, WITHOUT_ARRAY_WRAPPER
5571
)
56-
FROM [Sales].SalesOrderHeader AS h;
57-
```
72+
FROM [Sales].SalesOrderHeader AS h;
73+
```
5874

5975
### Query to optimize
60-
Here's an example of the type of query that you want to optimize by using an index.
61-
62-
```sql
76+
77+
Here's an example of the type of query that you want to optimize by using an index.
78+
79+
```sql
6380
SELECT SalesOrderNumber,
64-
OrderDate,
65-
JSON_VALUE(Info, '$.Customer.Name') AS CustomerName
81+
OrderDate,
82+
JSON_VALUE(Info, '$.Customer.Name') AS CustomerName
6683
FROM Sales.SalesOrderHeader
67-
WHERE JSON_VALUE(Info, '$.Customer.Name') = N'Aaron Campbell'
68-
```
84+
WHERE JSON_VALUE(Info, '$.Customer.Name') = N'Aaron Campbell';
85+
```
6986

7087
### Example index
88+
7189
If you want to speed up your filters or `ORDER BY` clauses over a property in a JSON document, you can use the same indexes that you're already using on other columns. However, you can't *directly* reference properties in the JSON documents.
7290

7391
1. First, create a "virtual column" that returns the values that you want to use for filtering.
74-
1. Then, create an index on that virtual column.
75-
76-
The following example creates a computed column that can be used for indexing. Then it creates an index on the new computed column. This example creates a column that exposes the customer name, which is stored in the `$.Customer.Name` path in the JSON data.
77-
78-
```sql
92+
1. Then, create an index on that virtual column.
93+
94+
The following example creates a computed column that can be used for indexing. Then it creates an index on the new computed column. This example creates a column that exposes the customer name, which is stored in the `$.Customer.Name` path in the JSON data.
95+
96+
```sql
7997
ALTER TABLE Sales.SalesOrderHeader
80-
ADD vCustomerName AS JSON_VALUE(Info,'$.Customer.Name')
98+
ADD vCustomerName AS JSON_VALUE(Info, '$.Customer.Name');
8199

82100
CREATE INDEX idx_soh_json_CustomerName
83-
ON Sales.SalesOrderHeader(vCustomerName)
84-
```
101+
ON Sales.SalesOrderHeader(vCustomerName);
102+
```
85103

86-
This statement will return the following warning:
104+
This statement returns the following warning:
87105

88106
```output
89107
Warning! The maximum key length for a nonclustered index is 1700 bytes.
90108
The index 'vCustomerName' has maximum length of 8000 bytes.
91109
For some combination of large values, the insert/update operation will fail.
92110
```
93111

94-
The `JSON_VALUE` function might return text values up to 8000 bytes (for example, as the **nvarchar(4000)** type). However, the values that are longer than 1700 bytes cannot be indexed. If you try to enter the value in the indexed computed column that is longer than 1700 bytes, the data manipulation language (DML) operation will fail.
112+
The `JSON_VALUE` function might return text values up to 8000 bytes (for example, as the **nvarchar(4000)** type). However, the values that are longer than 1700 bytes can't be indexed. If you try to enter the value in the indexed computed column that is longer than 1700 bytes, the data manipulation language (DML) operation fails.
95113

96114
For better performance, try to cast the value that you expose using a computed column into the smallest applicable data type. Use **int** and **datetime2** types instead of string types.
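
The casting advice above can be sketched as follows. This is a minimal illustration, not text from the article: the column names `vCustomerNameShort` and `vCustomerID`, the index names, and the 100-character limit are all assumptions chosen for the example. It relies on the `$.Customer.Name` and `$.Customer.ID` paths populated by the earlier `UPDATE` statement.

```sql
-- Assumption: customer names fit in 100 characters (200 bytes as nvarchar),
-- which stays under the 1700-byte nonclustered key limit.
-- CAST truncates longer strings silently, so pick a length that suits your data.
ALTER TABLE Sales.SalesOrderHeader
    ADD vCustomerNameShort AS CAST(JSON_VALUE(Info, '$.Customer.Name') AS NVARCHAR(100));

CREATE INDEX idx_soh_json_CustomerNameShort
    ON Sales.SalesOrderHeader(vCustomerNameShort);

-- Assumption: $.Customer.ID always holds an integer value.
-- The CAST raises an error on rows where the value is not numeric.
ALTER TABLE Sales.SalesOrderHeader
    ADD vCustomerID AS CAST(JSON_VALUE(Info, '$.Customer.ID') AS INT);

CREATE INDEX idx_soh_json_CustomerID
    ON Sales.SalesOrderHeader(vCustomerID);
```

For the optimizer to match these indexes, queries would likely need to use the same `CAST(JSON_VALUE(...))` expression, or reference the computed column by name.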
97115

98116
### More info about the computed column
99-
A computed column is not persisted. A computer column computed only when the index needs to be rebuilt. It does not occupy additional space in the table.
100-
101-
It's important that you create the computed column with the same expression that you plan to use in your queries - in this example, the expression is `JSON_VALUE(Info, '$.Customer.Name')`.
102-
117+
118+
A computed column isn't persisted. A computed column is only computed when the index needs to be rebuilt. It doesn't occupy additional space in the table.
119+
120+
It's important that you create the computed column with the same expression that you plan to use in your queries - in this example, the expression is `JSON_VALUE(Info, '$.Customer.Name')`.
121+
103122
You don't have to rewrite your queries. If you use expressions with the `JSON_VALUE` function, as shown in the preceding example query, SQL Server sees that there's an equivalent computed column with the same expression and applies an index if possible.
104123

105124
### Execution plan for this example
106-
Here's the execution plan for the query in this example.
107-
125+
126+
Here's the execution plan for the query in this example.
127+
108128
:::image type="content" source="media/index-json-data/json-index-seek.png" alt-text="Screenshot showing the execution plan for this example.":::
109-
110-
Instead of a full table scan, SQL Server uses an index seek into the nonclustered index and finds the rows that satisfy the specified conditions. Then it uses a key lookup in the `SalesOrderHeader` table to fetch the other columns that are referenced in the query - in this example, `SalesOrderNumber` and `OrderDate`.
129+
130+
Instead of a full table scan, SQL Server uses an index seek into the nonclustered index and finds the rows that satisfy the specified conditions. Then it uses a key lookup in the `SalesOrderHeader` table to fetch the other columns that are referenced in the query - in this example, `SalesOrderNumber` and `OrderDate`.
111131

112132
### Optimize the index further with included columns
113-
If you add required columns in the index, you can avoid this additional lookup in the table. You can add these columns as standard included columns, as shown in the following example, which extends the preceding `CREATE INDEX` example.
114-
115-
```sql
133+
134+
If you add required columns in the index, you can avoid this extra lookup in the table. You can add these columns as standard included columns, as shown in the following example, which extends the preceding `CREATE INDEX` example.
135+
136+
```sql
116137
CREATE INDEX idx_soh_json_CustomerName
117-
ON Sales.SalesOrderHeader(vCustomerName)
118-
INCLUDE(SalesOrderNumber,OrderDate)
119-
```
120-
121-
In this case, SQL Server doesn't have to read additional data from the `SalesOrderHeader` table because everything it needs is included in the nonclustered JSON index. This type of index is a good way to combine JSON and column data in queries and to create optimal indexes for your workload.
122-
138+
ON Sales.SalesOrderHeader(vCustomerName)
139+
INCLUDE(SalesOrderNumber, OrderDate);
140+
```
141+
142+
In this case, SQL Server doesn't have to read more data from the `SalesOrderHeader` table because everything it needs is included in the nonclustered JSON index. This type of index is a good way to combine JSON and column data in queries and to create optimal indexes for your workload.
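
If the index from the earlier `CREATE INDEX` example already exists, running `CREATE INDEX` again with the same name fails. One way to handle this, sketched here as an assumption rather than as the article's method, is the `DROP_EXISTING` option, which rebuilds the existing index with the new definition.

```sql
-- Assumes idx_soh_json_CustomerName already exists on Sales.SalesOrderHeader.
-- DROP_EXISTING = ON replaces it in one statement; the statement fails
-- if no index with this name exists yet.
CREATE INDEX idx_soh_json_CustomerName
    ON Sales.SalesOrderHeader(vCustomerName)
    INCLUDE(SalesOrderNumber, OrderDate)
    WITH (DROP_EXISTING = ON);
```

Dropping the index with `DROP INDEX` and then re-creating it is an equivalent alternative when the index might not exist yet.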
143+
123144
## JSON indexes are collation-aware indexes
124-
An important feature of indexes over JSON data is that the indexes are collation-aware. The result of the `JSON_VALUE` function that you use when you create the computed column is a text value that inherits its collation from the input expression. Therefore, values in the index are ordered using the collation rules defined in the source columns.
125-
126-
To demonstrate that the indexes are collation-aware, the following example creates a simple collection table with a primary key and JSON content.
127-
128-
```sql
145+
146+
An important feature of indexes over JSON data is that the indexes are collation-aware. The result of the `JSON_VALUE` function that you use when you create the computed column is a text value that inherits its collation from the input expression. Therefore, values in the index are ordered using the collation rules defined in the source columns.
147+
148+
To demonstrate that the indexes are collation-aware, the following example creates a simple collection table with a primary key and JSON content.
149+
150+
```sql
129151
CREATE TABLE JsonCollection
130-
(
131-
id INT IDENTITY CONSTRAINT PK_JSON_ID PRIMARY KEY,
132-
[json] NVARCHAR(MAX) COLLATE SERBIAN_CYRILLIC_100_CI_AI
133-
CONSTRAINT [Content should be formatted as JSON]
134-
CHECK(ISJSON(json)>0)
135-
)
136-
```
137-
138-
The preceding command specifies the Serbian Cyrillic collation for the `json` column. The following example populates the table and creates an index on the name property.
139-
140-
```sql
152+
(
153+
id INT IDENTITY CONSTRAINT PK_JSON_ID PRIMARY KEY,
154+
[json] NVARCHAR (MAX) COLLATE SERBIAN_CYRILLIC_100_CI_AI
155+
CONSTRAINT [Content should be formatted as JSON] CHECK (ISJSON(json) > 0)
156+
);
157+
```
158+
159+
The preceding command specifies the Serbian Cyrillic collation for the `json` column. The following example populates the table and creates an index on the name property.
160+
161+
```sql
141162
INSERT INTO JsonCollection
142163
VALUES
143-
(N'{"name":"Иво","surname":"Андрић"}'),
144-
(N'{"name":"Андрија","surname":"Герић"}'),
145-
(N'{"name":"Владе","surname":"Дивац"}'),
146-
(N'{"name":"Новак","surname":"Ђоковић"}'),
147-
(N'{"name":"Предраг","surname":"Стојаковић"}'),
148-
(N'{"name":"Михајло","surname":"Пупин"}'),
149-
(N'{"name":"Борислав","surname":"Станковић"}'),
150-
(N'{"name":"Владимир","surname":"Грбић"}'),
151-
(N'{"name":"Жарко","surname":"Паспаљ"}'),
152-
(N'{"name":"Дејан","surname":"Бодирога"}'),
153-
(N'{"name":"Ђорђе","surname":"Вајферт"}'),
154-
(N'{"name":"Горан","surname":"Бреговић"}'),
155-
(N'{"name":"Милутин","surname":"Миланковић"}'),
156-
(N'{"name":"Никола","surname":"Тесла"}')
164+
(N'{"name":"Иво","surname":"Андрић"}'),
165+
(N'{"name":"Андрија","surname":"Герић"}'),
166+
(N'{"name":"Владе","surname":"Дивац"}'),
167+
(N'{"name":"Новак","surname":"Ђоковић"}'),
168+
(N'{"name":"Предраг","surname":"Стојаковић"}'),
169+
(N'{"name":"Михајло","surname":"Пупин"}'),
170+
(N'{"name":"Борислав","surname":"Станковић"}'),
171+
(N'{"name":"Владимир","surname":"Грбић"}'),
172+
(N'{"name":"Жарко","surname":"Паспаљ"}'),
173+
(N'{"name":"Дејан","surname":"Бодирога"}'),
174+
(N'{"name":"Ђорђе","surname":"Вајферт"}'),
175+
(N'{"name":"Горан","surname":"Бреговић"}'),
176+
(N'{"name":"Милутин","surname":"Миланковић"}'),
177+
(N'{"name":"Никола","surname":"Тесла"}');
157178
GO
158-
179+
159180
ALTER TABLE JsonCollection
160-
ADD vName AS JSON_VALUE(json,'$.name')
181+
ADD vName AS JSON_VALUE(json, '$.name');
161182

162183
CREATE INDEX idx_name
163-
ON JsonCollection(vName)
164-
```
165-
166-
The preceding commands create a standard index on the computed column `vName`, which represents the value from the JSON `$.name` property. In the Serbian Cyrillic code page, the order of the letters is `А`, `Б`, `В`, `Г`, `Д`, `Ђ`, `Е`, etc. The order of items in the index is compliant with Serbian Cyrillic rules because the result of the `JSON_VALUE` function inherits its collation from the source column. The following example queries this collection and sorts the results by name.
167-
168-
```sql
169-
SELECT JSON_VALUE(json,'$.name'),*
184+
ON JsonCollection(vName);
185+
```
186+
187+
The preceding commands create a standard index on the computed column `vName`, which represents the value from the JSON `$.name` property. In the Serbian Cyrillic code page, the order of the letters is `А`, `Б`, `В`, `Г`, `Д`, `Ђ`, `Е`, etc. The order of items in the index is compliant with Serbian Cyrillic rules because the result of the `JSON_VALUE` function inherits its collation from the source column. The following example queries this collection and sorts the results by name.
188+
189+
```sql
190+
SELECT JSON_VALUE(json, '$.name'),
191+
*
170192
FROM JsonCollection
171-
ORDER BY JSON_VALUE(json,'$.name')
172-
```
173-
174-
If you look at the actual execution plan, you see that it uses sorted values from the nonclustered index.
175-
176-
:::image type="content" source="media/index-json-data/json-index-scan.png" alt-text="Screenshot showing an execution plan that uses sorted values from the non-clustered index." lightbox="media/index-json-data/json-index-scan.png":::
177-
178-
Although the query has an `ORDER BY` clause, the execution plan doesn't use a Sort operator. The JSON index is already ordered according to Serbian Cyrillic rules. Therefore SQL Server can use the nonclustered index where results are already sorted.
179-
180-
However, if you change the collation of the `ORDER BY` expression - for example, if you add `COLLATE French_100_CI_AS_SC` after the `JSON_VALUE` function - you get a different query execution plan.
181-
182-
:::image type="content" source="media/index-json-data/json-index-execution-plan.png" alt-text="Screenshot showing a different execution plan." lightbox="media/index-json-data/json-index-execution-plan.png":::
183-
184-
Since the order of values in the index is not compliant with French collation rules, SQL Server can't use the index to order results. Therefore, it adds a Sort operator that sorts results using French collation rules.
193+
ORDER BY JSON_VALUE(json, '$.name');
194+
```
185195

186-
### Microsoft videos
196+
If you look at the actual execution plan, you see that it uses sorted values from the nonclustered index.
197+
198+
:::image type="content" source="media/index-json-data/json-index-scan.png" alt-text="Screenshot showing an execution plan that uses sorted values from the nonclustered index." lightbox="media/index-json-data/json-index-scan.png":::
187199

188-
> [!NOTE]
189-
> Some of the video links in this section might not work at this time. Microsoft is migrating content formerly on Channel 9 to a new platform. We will update the links as the videos are migrated to the new platform.
200+
Although the query has an `ORDER BY` clause, the execution plan doesn't use a Sort operator. The JSON index is already ordered according to Serbian Cyrillic rules. Therefore SQL Server can use the nonclustered index where results are already sorted.
201+
202+
However, if you change the collation of the `ORDER BY` expression - for example, if you add `COLLATE French_100_CI_AS_SC` after the `JSON_VALUE` function - you get a different query execution plan.
203+
204+
:::image type="content" source="media/index-json-data/json-index-execution-plan.png" alt-text="Screenshot showing a different execution plan." lightbox="media/index-json-data/json-index-execution-plan.png":::
205+
206+
Since the order of values in the index isn't compliant with French collation rules, SQL Server can't use the index to order results. Therefore, it adds a Sort operator that sorts results using French collation rules.
207+
208+
### Microsoft videos
190209

191-
For a visual introduction to the built-in JSON support in SQL Server and Azure SQL Database, see the following videos:
210+
For a visual introduction to the built-in JSON support in SQL Server and Azure SQL Database, see the following video:
192211

193-
- [JSON as a bridge between NoSQL and relational worlds](https://channel9.msdn.com/events/DataDriven-SQLServer2016/JSON-as-bridge-betwen-NoSQL-relational-worlds)
212+
- [JSON as a bridge between NoSQL and relational worlds](/shows/datadriven-sqlserver2016/json-as-bridge-betwen-nosql-relational-worlds)
194213

195214
## Related content
196215
