Commit cce6e5d

authored
Merge pull request #230119 from normesta/gen2
Gen2 conceptual refresh
2 parents 87a6644 + c983134 commit cce6e5d

10 files changed (+36 −38 lines)

articles/storage/blobs/data-lake-storage-abfs-driver.md

Lines changed: 5 additions & 5 deletions

@@ -7,30 +7,30 @@ author: normesta
 ms.topic: conceptual
 ms.author: normesta
 ms.reviewer: jamesbak
-ms.date: 12/06/2018
+ms.date: 03/09/2023
 ms.service: storage
 ms.subservice: data-lake-storage-gen2
 ---

 # The Azure Blob Filesystem driver (ABFS): A dedicated Azure Storage driver for Hadoop

-One of the primary access methods for data in Azure Data Lake Storage Gen2 is via the [Hadoop FileSystem](https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/filesystem/index.html). Data Lake Storage Gen2 allows users of Azure Blob Storage access to a new driver, the Azure Blob File System driver or `ABFS`. ABFS is part of Apache Hadoop and is included in many of the commercial distributions of Hadoop. Using this driver, many applications and frameworks can access data in Azure Blob Storage without any code explicitly referencing Data Lake Storage Gen2.
+One of the primary access methods for data in Azure Data Lake Storage Gen2 is via the [Hadoop FileSystem](https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/filesystem/index.html). Data Lake Storage Gen2 allows users of Azure Blob Storage access to a new driver, the Azure Blob File System driver or `ABFS`. ABFS is part of Apache Hadoop and is included in many of the commercial distributions of Hadoop. By using the ABFS driver, many applications and frameworks can access data in Azure Blob Storage without any code explicitly referencing Data Lake Storage Gen2.

 ## Prior capability: The Windows Azure Storage Blob driver

 The Windows Azure Storage Blob driver or [WASB driver](https://hadoop.apache.org/docs/current/hadoop-azure/index.html) provided the original support for Azure Blob Storage. This driver performed the complex task of mapping file system semantics (as required by the Hadoop FileSystem interface) to that of the object store style interface exposed by Azure Blob Storage. This driver continues to support this model, providing high performance access to data stored in blobs, but contains a significant amount of code performing this mapping, making it difficult to maintain. Additionally, some operations such as [FileSystem.rename()](https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/filesystem/filesystem.html#boolean_renamePath_src_Path_d) and [FileSystem.delete()](https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/filesystem/filesystem.html#boolean_deletePath_p_boolean_recursive) when applied to directories require the driver to perform a vast number of operations (due to object stores' lack of support for directories), which often leads to degraded performance. The ABFS driver was designed to overcome the inherent deficiencies of WASB.

 ## The Azure Blob File System driver

-The [Azure Data Lake Storage REST interface](/rest/api/storageservices/data-lake-storage-gen2) is designed to support file system semantics over Azure Blob Storage. Given that the Hadoop FileSystem is also designed to support the same semantics there is no requirement for a complex mapping in the driver. Thus, the Azure Blob File System driver (or ABFS) is a mere client shim for the REST API.
+The [Azure Data Lake Storage REST interface](/rest/api/storageservices/data-lake-storage-gen2) is designed to support file system semantics over Azure Blob Storage. Given that the Hadoop file system is also designed to support the same semantics, there's no requirement for a complex mapping in the driver. Thus, the Azure Blob File System driver (or ABFS) is a mere client shim for the REST API.

 However, there are some functions that the driver must still perform:

 ### URI scheme to reference data

-Consistent with other FileSystem implementations within Hadoop, the ABFS driver defines its own URI scheme so that resources (directories and files) may be distinctly addressed. The URI scheme is documented in [Use the Azure Data Lake Storage Gen2 URI](./data-lake-storage-introduction-abfs-uri.md). The structure of the URI is: `abfs[s]://file_system@account_name.dfs.core.windows.net/<path>/<path>/<file_name>`
+Consistent with other file system implementations within Hadoop, the ABFS driver defines its own URI scheme so that resources (directories and files) may be distinctly addressed. The URI scheme is documented in [Use the Azure Data Lake Storage Gen2 URI](./data-lake-storage-introduction-abfs-uri.md). The structure of the URI is: `abfs[s]://file_system@account_name.dfs.core.windows.net/<path>/<path>/<file_name>`

-Using the above URI format, standard Hadoop tools and frameworks can be used to reference these resources:
+By using this URI format, standard Hadoop tools and frameworks can be used to reference these resources:

 ```bash
 hdfs dfs -mkdir -p abfs://[email protected]/tutorials/flightdelays/data
 ```
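The URI structure described in this article can also be decomposed with plain shell parameter expansion. The sketch below is illustrative only; the container name `mycontainer` and account name `myaccount` are hypothetical placeholders, not values from the commit:

```shell
# Hypothetical ABFS URI: <file_system>@<account_name>.dfs.core.windows.net/<path>
uri="abfs://mycontainer@myaccount.dfs.core.windows.net/tutorials/flightdelays/data"

rest="${uri#abfs://}"                  # strip the scheme identifier
file_system="${rest%%@*}"              # container name, before the "@"
host="${rest#*@}"; host="${host%%/*}"  # account host, after the "@"
path="${rest#*/}"                      # directory path, after the host

echo "$file_system $host"
# → mycontainer myaccount.dfs.core.windows.net
```

The same expansions work in any POSIX shell, so the decomposition needs no external tools.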

articles/storage/blobs/data-lake-storage-best-practices.md

Lines changed: 4 additions & 4 deletions

@@ -7,7 +7,7 @@ author: normesta
 ms.subservice: data-lake-storage-gen2
 ms.service: storage
 ms.topic: conceptual
-ms.date: 09/29/2022
+ms.date: 03/09/2023
 ms.author: normesta
 ms.reviewer: sachins
 ---
@@ -37,7 +37,7 @@ Use the following pattern as you configure your account to use Blob storage feat

 #### Understand the terms used in documentation

-As you move between content sets, you'll notice some slight terminology differences. For example, content featured in the [Blob storage documentation](storage-blobs-introduction.md), will use the term *blob* instead of *file*. Technically, the files that you ingest to your storage account become blobs in your account. Therefore, the term is correct. However, the term *blob* can cause confusion if you're used to the term *file*. You'll also see the term *container* used to refer to a *file system*. Consider these terms as synonymous.
+As you move between content sets, you notice some slight terminology differences. For example, content featured in the [Blob storage documentation](storage-blobs-introduction.md) will use the term *blob* instead of *file*. Technically, the files that you ingest to your storage account become blobs in your account. Therefore, the term is correct. However, the term *blob* can cause confusion if you're used to the term *file*. You'll also see the term *container* used to refer to a *file system*. Consider these terms as synonymous.

 ## Consider premium

@@ -84,7 +84,7 @@ Consider pre-planning the structure of your data. File format, file size, and di

 ### File formats

-Data can be ingested in various formats. Data can be appear in human readable formats such as JSON, CSV, or XML or as compressed binary formats such as `.tar.gz`. Data can come in various sizes as well. Data can be composed of large files (a few terabytes) such as data from an export of a SQL table from your on-premises systems. Data can also come in the form of a large number of tiny files (a few kilobytes) such as data from real-time events from an Internet of things (IoT) solution. You can optimize efficiency and costs by choosing an appropriate file format and file size.
+Data can be ingested in various formats. Data can appear in human-readable formats such as JSON, CSV, or XML, or as compressed binary formats such as `.tar.gz`. Data can come in various sizes as well. Data can be composed of large files (a few terabytes) such as data from an export of a SQL table from your on-premises systems. Data can also come in the form of a large number of tiny files (a few kilobytes) such as data from real-time events from an Internet of things (IoT) solution. You can optimize efficiency and costs by choosing an appropriate file format and file size.

 Hadoop supports a set of file formats that are optimized for storing and processing structured data. Some common formats are Avro, Parquet, and Optimized Row Columnar (ORC) format. All of these formats are machine-readable binary file formats. They're compressed to help you manage file size. They have a schema embedded in each file, which makes them self-describing. The difference between these formats is in how data is stored. Avro stores data in a row-based format and the Parquet and ORC formats store data in a columnar format.

@@ -100,7 +100,7 @@ Larger files lead to better performance and reduced costs.

 Typically, analytics engines such as HDInsight have a per-file overhead that involves tasks such as listing, checking access, and performing various metadata operations. If you store your data as many small files, this can negatively affect performance. In general, organize your data into larger sized files for better performance (256 MB to 100 GB in size). Some engines and applications might have trouble efficiently processing files that are greater than 100 GB in size.

-Increasing file size can also reduce transaction costs. Read and write operations are billed in 4-megabyte increments so you're charged for operation whether or not the file contains 4 megabytes or only a few kilobytes. For pricing information, see [Azure Data Lake Storage pricing](https://azure.microsoft.com/pricing/details/storage/data-lake/).
+Increasing file size can also reduce transaction costs. Read and write operations are billed in 4-megabyte increments, so you're charged for an operation whether the file contains 4 megabytes of data or only a few kilobytes. For pricing information, see [Azure Data Lake Storage pricing](https://azure.microsoft.com/pricing/details/storage/data-lake/).

 Sometimes, data pipelines have limited control over the raw data, which has lots of small files. In general, we recommend that your system have some sort of process to aggregate small files into larger ones for use by downstream applications. If you're processing data in real time, you can use a real time streaming engine (such as [Azure Stream Analytics](../../stream-analytics/stream-analytics-introduction.md) or [Spark Streaming](https://databricks.com/glossary/what-is-spark-streaming)) together with a message broker (such as [Event Hubs](../../event-hubs/event-hubs-about.md) or [Apache Kafka](https://kafka.apache.org/)) to store your data as larger files. As you aggregate small files into larger ones, consider saving them in a read-optimized format such as [Apache Parquet](https://parquet.apache.org/) for downstream processing.
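The effect of the 4-megabyte billing increment can be sketched with some shell arithmetic. This is an illustration of the rounding rule only, with 4 MiB used as the increment size and made-up file sizes; consult the pricing page for actual rates:

```shell
# Each read/write operation covers up to one 4 MiB increment, so a transfer
# is billed as ceil(bytes / increment) operations.
increment=$((4 * 1024 * 1024))

billed_ops() {
  echo $(( ($1 + increment - 1) / increment ))
}

billed_ops $((256 * 1024 * 1024))   # one 256 MiB file
# → 64
billed_ops 4096                     # one 4 KiB file still costs a full operation
# → 1
```

This is why 1,000 tiny files cost 1,000 operations, while the same data merged into one larger file costs far fewer.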

articles/storage/blobs/data-lake-storage-introduction-abfs-uri.md

Lines changed: 6 additions & 6 deletions

@@ -1,12 +1,12 @@
 ---
 title: Use the Azure Data Lake Storage Gen2 URI
 titleSuffix: Azure Storage
-description: Learn URI syntax for the abfs scheme identifier, which represents the Azure Blob File System driver (Hadoop Filesystem driver for Azure Data Lake Storage Gen2).
+description: Learn URI syntax for the ABFS scheme identifier, which represents the Azure Blob File System driver (Hadoop Filesystem driver for Azure Data Lake Storage Gen2).
 author: normesta

 ms.topic: conceptual
 ms.author: normesta
-ms.date: 12/06/2018
+ms.date: 03/09/2023
 ms.service: storage
 ms.subservice: data-lake-storage-gen2
 ms.reviewer: jamesbak
@@ -24,17 +24,17 @@ If the Data Lake Storage Gen2 capable account you wish to address **is not** set

 <pre>abfs[s]<sup>1</sup>://&lt;file_system&gt;<sup>2</sup>@&lt;account_name&gt;<sup>3</sup>.dfs.core.windows.net/&lt;path&gt;<sup>4</sup>/&lt;file_name&gt;<sup>5</sup></pre>

-1. **Scheme identifier**: The `abfs` protocol is used as the scheme identifier. If you add an 's' at the end (abfs<b><i>s</i></b>) then the ABFS Hadoop client driver will <i>ALWAYS</i> use Transport Layer Security (TLS) irrespective of the authentication method chosen. If you choose OAuth as your authentication then the client driver will always use TLS even if you specify 'abfs' instead of 'abfss' because OAuth solely relies on the TLS layer. Finally, if you choose to use the older method of storage account key, then the client driver will interpret 'abfs' to mean that you do not want to use TLS.
+1. **Scheme identifier**: The `abfs` protocol is used as the scheme identifier. If you add an `s` at the end (abfs<b><i>s</i></b>), then the ABFS Hadoop client driver will always use Transport Layer Security (TLS) irrespective of the authentication method chosen. If you choose OAuth as your authentication, then the client driver will always use TLS even if you specify `abfs` instead of `abfss` because OAuth solely relies on the TLS layer. Finally, if you choose to use the older method of storage account key, then the client driver interprets `abfs` to mean that you don't want to use TLS.

-2. **File system**: The parent location that holds the files and folders. This is the same as Containers in the Azure Storage Blobs service.
+2. **File system**: The parent location that holds the files and folders. This is the same as containers in the Azure Storage Blob service.

 3. **Account name**: The name given to your storage account during creation.

 4. **Paths**: A forward slash delimited (`/`) representation of the directory structure.

-5. **File name**: The name of the individual file. This parameter is optional if you are addressing a directory.
+5. **File name**: The name of the individual file. This parameter is optional if you're addressing a directory.

-However, if the account you wish to address is set as the default file system during account creation, then the shorthand URI syntax is:
+However, if the account you want to address is set as the default file system during account creation, then the shorthand URI syntax is:

 <pre>/&lt;path&gt;<sup>1</sup>/&lt;file_name&gt;<sup>2</sup></pre>
articles/storage/blobs/data-lake-storage-introduction.md

Lines changed: 3 additions & 2 deletions

@@ -6,7 +6,7 @@ author: normesta

 ms.service: storage
 ms.topic: overview
-ms.date: 03/01/2023
+ms.date: 03/09/2023
 ms.author: normesta
 ms.reviewer: jamesbak
 ms.subservice: data-lake-storage-gen2
@@ -16,7 +16,7 @@ ms.subservice: data-lake-storage-gen2

 Azure Data Lake Storage Gen2 is a set of capabilities dedicated to big data analytics, built on [Azure Blob Storage](storage-blobs-introduction.md).

-Data Lake Storage Gen2 converges the capabilities of [Azure Data Lake Storage Gen1](../../data-lake-store/index.yml) with Azure Blob Storage. For example, Data Lake Storage Gen2 provides file system semantics, file-level security, and scale. Because these capabilities are built on Blob storage, you'll also get low-cost, tiered storage, with high availability/disaster recovery capabilities.
+Data Lake Storage Gen2 converges the capabilities of [Azure Data Lake Storage Gen1](../../data-lake-store/index.yml) with Azure Blob Storage. For example, Data Lake Storage Gen2 provides file system semantics, file-level security, and scale. Because these capabilities are built on Blob storage, you also get low-cost, tiered storage, with high availability/disaster recovery capabilities.

 ## Designed for enterprise big data analytics

@@ -81,6 +81,7 @@ Several open source platforms support Data Lake Storage Gen2. For a complete lis

 ## See also

+- [Introduction to Azure Data Lake Storage Gen2 (Training module)](/training/modules/introduction-to-azure-data-lake-storage/)
 - [Best practices for using Azure Data Lake Storage Gen2](data-lake-storage-best-practices.md)
 - [Known issues with Azure Data Lake Storage Gen2](data-lake-storage-known-issues.md)
 - [Multi-protocol access on Azure Data Lake Storage](data-lake-storage-multi-protocol-access.md)

articles/storage/blobs/data-lake-storage-multi-protocol-access.md

Lines changed: 3 additions & 3 deletions

@@ -7,14 +7,14 @@ author: normesta
 ms.subservice: data-lake-storage-gen2
 ms.service: storage
 ms.topic: conceptual
-ms.date: 02/25/2020
+ms.date: 03/09/2023
 ms.author: normesta
 ms.reviewer: stewu
 ---

 # Multi-protocol access on Azure Data Lake Storage

-Blob APIs now work with accounts that have a hierarchical namespace. This unlocks the ecosystem of tools, applications, and services, as well as several Blob storage features to accounts that have a hierarchical namespace.
+Blob APIs work with accounts that have a hierarchical namespace. This unlocks the ecosystem of tools, applications, and services, as well as several Blob storage features, for accounts that have a hierarchical namespace.

 Until recently, you might have had to maintain separate storage solutions for object storage and analytics storage. That's because Azure Data Lake Storage Gen2 had limited ecosystem support. It also had limited access to Blob service features such as diagnostic logging. A fragmented storage solution is hard to maintain because you have to move data between accounts to accomplish various scenarios. You no longer have to do that.

@@ -23,7 +23,7 @@ With multi-protocol access on Data Lake Storage, you can work with your data by

 Blob storage features such as [diagnostic logging](../common/storage-analytics-logging.md), [access tiers](access-tiers-overview.md), and [Blob storage lifecycle management policies](./lifecycle-management-overview.md) now work with accounts that have a hierarchical namespace. Therefore, you can enable hierarchical namespaces on your Blob storage accounts without losing access to these important features.

 > [!NOTE]
-> Multi-protocol access on Data Lake Storage is generally available and is available in all regions. Some Azure services or Blob storage features enabled by multi-protocol access remain in preview. These articles summarize the current support for Blob storage features and Azure service integrations.
+> Some Azure services or Blob storage features enabled by multi-protocol access remain in preview. These articles summarize the current support for Blob storage features and Azure service integrations.
 >
 > [Blob Storage feature support in Azure Storage accounts](storage-feature-support-in-storage-accounts.md)
 >
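One way to picture multi-protocol access: the same object in one account is reachable through both the Blob endpoint and the Data Lake Storage (DFS) endpoint, and the two URLs differ only in the host name. A small sketch, where the account name `contoso` and the object path are hypothetical:

```shell
# A hypothetical object addressed through the Blob endpoint.
blob_url="https://contoso.blob.core.windows.net/data/raw/events.json"

# The same account and object through the DFS endpoint: only ".blob." changes
# to ".dfs." in the host name.
dfs_url=$(printf '%s\n' "$blob_url" | sed 's/\.blob\./.dfs./')
echo "$dfs_url"
# → https://contoso.dfs.core.windows.net/data/raw/events.json
```

Tools that speak the Blob API and analytics frameworks that use the ABFS driver therefore operate on the same underlying data without copying it between accounts.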

articles/storage/blobs/data-lake-storage-namespace.md

Lines changed: 3 additions & 3 deletions

@@ -1,12 +1,12 @@
 ---
-title: Azure Data Lake Storage Gen2 Hierarchical Namespace
+title: Azure Data Lake Storage Gen2 hierarchical namespace
 titleSuffix: Azure Storage
 description: Describes the concept of a hierarchical namespace for Azure Data Lake Storage Gen2
 author: normesta

 ms.service: storage
 ms.topic: conceptual
-ms.date: 10/22/2021
+ms.date: 03/09/2023
 ms.author: normesta
 ms.reviewer: jamesbak
 ms.subservice: data-lake-storage-gen2
@@ -44,5 +44,5 @@ To analyze differences in data storage prices, transaction prices, and storage c

 ## Next steps

-- Enable a hierarchical namespace when you create a new storage account. See [Create a Storage account](../common/storage-account-create.md).
+- Enable a hierarchical namespace when you create a new storage account. See [Create a storage account to use with Azure Data Lake Storage Gen2](create-data-lake-storage-account.md).
 - Enable a hierarchical namespace on an existing storage account. See [Upgrade Azure Blob Storage with Azure Data Lake Storage Gen2 capabilities](upgrade-to-data-lake-storage-gen2-how-to.md).
