samples/features/sql2019notebooks/README.md
7 lines changed: 7 additions & 0 deletions
@@ -26,6 +26,13 @@ The [What's New](https://docs.microsoft.com/sql/sql-server/what-s-new-in-sql-ser
 * **[Basic_ADR.ipynb](https://github.com/microsoft/sql-server-samples/blob/master/samples/features/accelerated-database-recovery/basic_adr.ipynb)** - In this notebook, you will see how fast long-running transaction rollback can now be with Accelerated Database Recovery. You will also see that a long active transaction does not affect the ability to truncate the transaction log.
 * **[Recovery_ADR.ipynb](https://github.com/microsoft/sql-server-samples/blob/master/samples/features/accelerated-database-recovery/recovery_adr.ipynb)** - In this example, you will see how Accelerated Database Recovery will speed up recovery.
 
+### Unicode Support (UTF-8 and UTF-16)
+* **[DataType_WesternMyth.ipynb](https://github.com/microsoft/sql-server-samples/blob/master/samples/features/unicode/notebooks/DataType_WesternMyth.ipynb)** - In this notebook, you will see proof that the integer that defines the length of string types (CHAR/VARCHAR/NCHAR/NVARCHAR) does not mean "number of characters" but the amount of storage (bytes for CHAR/VARCHAR, byte-pairs for NCHAR/NVARCHAR), debunking a common misconception in SQL Server.
+* **[Functional.ipynb](https://github.com/microsoft/sql-server-samples/blob/master/samples/features/unicode/notebooks/Functional.ipynb)** - In this notebook, you will see how to use UTF-8 in your database or columns.
+* **[Storage.ipynb](https://github.com/microsoft/sql-server-samples/blob/master/samples/features/unicode/notebooks/Storage.ipynb)** - In this notebook, you will see how significant the storage footprint differences are between Unicode data encoded in UTF-8 and UTF-16.
+* **[Perf_Latin.ipynb](https://github.com/microsoft/sql-server-samples/blob/master/samples/features/unicode/notebooks/Perf_Latin.ipynb)** - In this notebook, you will see the performance differences between string data encoded in UTF-8 and UTF-16, using Latin data.
+* **[Perf_Non-Latin.ipynb](https://github.com/microsoft/sql-server-samples/blob/master/samples/features/unicode/notebooks/Perf_Non-Latin.ipynb)** - In this notebook, you will see the performance differences between string data encoded in UTF-8 and UTF-16, using non-Latin data.
+
 ### SQL Server 2019 Querying 1 TRILLION rows
 * **[OneTrillionRowsWarm.ipynb](https://github.com/microsoft/sql-server-samples/blob/master/samples/features/sql2019notebooks/OneTrillionRowsWarm.ipynb)** - This notebook shows how SQL Server 2019 reads **9 BILLION rows/second** from a 1 trillion row table with a warm cache.
 * **[OneTrillionRowsCold.ipynb](https://github.com/microsoft/sql-server-samples/blob/master/samples/features/sql2019notebooks/OneTrillionRowsCold.ipynb)** - This notebook shows how SQL Server 2019 performs IO at **~24GB/s** using a 1 trillion row table with a cold cache.
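
The claims behind the DataType_WesternMyth and Storage notebook descriptions (a declared length is a storage budget rather than a character count, and the footprint depends on the encoding) can be illustrated outside SQL Server. The following is a minimal Python sketch, assuming only standard UTF-8 and UTF-16 encoding rules; the helper name `encoded_sizes` and the sample strings are hypothetical and are not taken from the notebooks.

```python
def encoded_sizes(text):
    """Return (characters, UTF-8 bytes, UTF-16 bytes) for a string."""
    return (
        len(text),                      # code points, not bytes
        len(text.encode("utf-8")),      # 1 to 4 bytes per code point
        len(text.encode("utf-16-le")),  # 2 or 4 bytes per code point
    )

# Hypothetical samples, written with escapes so the code points are unambiguous.
samples = {
    "ASCII": "hello",
    "Latin with accent": "h\u00e9llo",
    "Japanese": "\u3053\u3093\u306b\u3061\u306f",
    "Supplementary (pretzel)": "\U0001F968",
}

for label, text in samples.items():
    chars, utf8_bytes, utf16_bytes = encoded_sizes(text)
    print(f"{label}: {chars} chars, {utf8_bytes} bytes UTF-8, {utf16_bytes} bytes UTF-16")
```

The same five-character string takes 5, 6, or 15 bytes in UTF-8 depending on its content, which is why a length expressed in bytes cannot be read as a character count.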
+See how many bytes each character requires for both UTF-8 and UTF-16 encodings.
+Returns all 65,536 BMP (Basic Multilingual Plane) characters (which is also the entire UCS-2 character set), and 3 Supplementary Characters.
+Since all Supplementary Characters are 4 bytes in both encodings, there is no need to return more of them, but we do need to see a few of them to see that they are:
+a) all 4 bytes
+b) encoded slightly differently
+*/
+;
+
+WITH nums ([CodePoint])
+AS (
+    SELECT TOP (65536) (
+            ROW_NUMBER() OVER (
+                ORDER BY (
+                        SELECT 0
+                        )
+                ) - 1
+            )
+    FROM [master].[sys].[columns] col
+    CROSS JOIN [master].[sys].[objects] obj
+    ), chars
+AS (
+    SELECT nums.[CodePoint], CONVERT(VARCHAR(4), NCHAR(nums.[CodePoint]) COLLATE Latin1_General_100_CI_AS_SC_UTF8) AS [TheChar], CONVERT(VARBINARY(4), CONVERT(VARCHAR(4), NCHAR(nums.[CodePoint]) COLLATE Latin1_General_100_CI_AS_SC_UTF8)) AS [UTF8]
+    FROM nums
+
+    UNION ALL
+
+    SELECT tmp.val, CONVERT(VARCHAR(4), CONVERT(NVARCHAR(5), tmp.hex) COLLATE Latin1_General_100_CI_AS_SC_UTF8) AS [TheChar], CONVERT(VARBINARY(4), CONVERT(VARCHAR(4), CONVERT(NVARCHAR(5), tmp.hex) COLLATE Latin1_General_100_CI_AS_SC_UTF8)) AS [UTF8]
+    FROM (
+        VALUES (65536, 0x00D800DC), -- Linear B Syllable B008 A (U+10000)
+            (67618, 0x02D822DC), -- Cypriot Syllable Pu (U+10822)
+            (129384, 0x3ED868DD) -- Pretzel (U+1F968)
+        ) tmp(val, hex)
+    )
+SELECT chr.[CodePoint], COALESCE(chr.[TheChar], N'TOTALS:') AS [Character], chr.[UTF8] AS [UTF8_Hex], DATALENGTH(chr.[UTF8]) AS [UTF8_Bytes], COUNT(CASE DATALENGTH(chr.[UTF8]) WHEN 1 THEN 'x' END) AS [1-byte], COUNT(CASE DATALENGTH(chr.[UTF8]) WHEN 2 THEN 'x' END) AS [2-bytes], COUNT(CASE DATALENGTH(chr.[UTF8]) WHEN 3 THEN 'x' END) AS [3-bytes], COUNT(CASE DATALENGTH(chr.[UTF8]) WHEN 4 THEN 'x' END) AS [4-bytes],
+    ---
+    CONVERT(VARBINARY(4), CONVERT(NVARCHAR(3), chr.[TheChar])) AS [UTF16(LE)_Hex], DATALENGTH(CONVERT(NVARCHAR(3), chr.[TheChar])) AS [UTF16_Bytes],
+    ---
+    ((DATALENGTH(CONVERT(NVARCHAR(3), chr.[TheChar]))) - (DATALENGTH(chr.[TheChar]))) AS [UTF8savingsOverUTF16]
+FROM chars chr
+GROUP BY ROLLUP((chr.[CodePoint], chr.[TheChar], chr.[UTF8]));
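
As a cross-check on the per-length counts this query tallies, the byte-length buckets can be derived outside SQL Server. The following is a minimal Python sketch, assuming only the standard UTF-8 and UTF-16 encoding rules; the function name `utf8_length_histogram` is hypothetical. It skips the surrogate range U+D800 to U+DFFF, which Python refuses to encode on its own, so its totals need not match however SQL Server treats those 2,048 code points.

```python
from collections import Counter

def utf8_length_histogram(start=0x0000, stop=0x10000):
    """Count code points in [start, stop) by UTF-8 byte length, skipping surrogates."""
    counts = Counter()
    for cp in range(start, stop):
        if 0xD800 <= cp <= 0xDFFF:
            continue  # lone surrogates cannot be encoded as UTF-8
        counts[len(chr(cp).encode("utf-8"))] += 1
    return counts

bmp = utf8_length_histogram()
print(dict(bmp))  # {1: 128, 2: 1920, 3: 61440}

# The three supplementary characters used in the query's VALUES clause.
for cp in (0x10000, 0x10822, 0x1F968):
    ch = chr(cp)
    utf8_hex = ch.encode("utf-8").hex()
    utf16_hex = ch.encode("utf-16-le").hex()
    print(f"U+{cp:05X}: UTF-8 {utf8_hex} ({len(utf8_hex) // 2} bytes), "
          f"UTF-16LE {utf16_hex} ({len(utf16_hex) // 2} bytes)")
```

The UTF-16LE hex strings printed for the three supplementary characters match the `0x00D800DC`, `0x02D822DC`, and `0x3ED868DD` literals in the query, and each of them is 4 bytes in both encodings.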