articles/storage/blobs/data-lake-storage-introduction.md
7 additions & 6 deletions
@@ -18,7 +18,6 @@ Azure Data Lake Storage Gen2 is a set of capabilities dedicated to big data anal
Data Lake Storage Gen2 converges the capabilities of [Azure Data Lake Storage Gen1](../../data-lake-store/index.yml) with Azure Blob Storage. For example, Data Lake Storage Gen2 provides file system semantics, file-level security, and scale. Because these capabilities are built on Blob storage, you also get low-cost, tiered storage, with high availability/disaster recovery capabilities.
-
Data Lake Storage Gen2 makes Azure Storage the foundation for building enterprise data lakes on Azure. Designed from the start to service multiple petabytes of information while sustaining hundreds of gigabits of throughput, Data Lake Storage Gen2 allows you to easily manage massive amounts of data.
## What is a Data Lake?
@@ -31,7 +30,7 @@ _Azure Data Lake Storage_ is a cloud-based, enterprise data lake solution. It's
_Azure Data Lake Storage Gen2_ refers to the current implementation of Azure's Data Lake Storage solution. The previous implementation, _Azure Data Lake Storage Gen1_, is scheduled to be retired on February 29, 2024.
-Unlike Data Lake Storage Gen1, Data Lake Storage Gen2 isn't a dedicated service or account type. Instead, it's implemented as a set of capabilities that you use with the Blob Storage service of your Azure Storage account. You can unlock these capabilities by enabling the hierarchical namespace setting. This setting is not enabled by default. You can enable it when you create the account or after you created the account by using an account upgrade tool that you can run from the settings of your account.
+Unlike Data Lake Storage Gen1, Data Lake Storage Gen2 isn't a dedicated service or account type. Instead, it's implemented as a set of capabilities that you use with the Blob Storage service of your Azure Storage account. You can unlock these capabilities by enabling the hierarchical namespace setting. This setting is not enabled by default. You must enable it either as you create the account or after you create the account.
## Data Lake Storage Gen2 capabilities
@@ -41,7 +40,7 @@ Azure Data Lake Storage Gen2 is primarily designed to work with Hadoop and all f
Data analysis frameworks that use HDFS as their data access layer can directly access Azure Data Lake Storage Gen2 data through ABFS. The Apache Spark analytics engine and the Presto SQL query engine are examples of such frameworks.
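As a rough illustration of what ABFS access looks like from the framework side, the sketch below builds the URI that an HDFS-compatible client would be pointed at. It assumes the public Azure endpoint suffix `dfs.core.windows.net`; the helper name `abfs_uri` and the account, file system, and path values are hypothetical and not part of the original document.

```python
def abfs_uri(file_system: str, account: str, path: str = "", secure: bool = True) -> str:
    """Build an ABFS URI for a path in a hierarchical-namespace account.

    file_system is the container name and account is the storage account name.
    Assumes the public Azure cloud endpoint suffix (dfs.core.windows.net).
    """
    # "abfss" selects the TLS-secured variant of the driver scheme.
    scheme = "abfss" if secure else "abfs"
    # Drop leading slashes so the URI has exactly one slash after the host.
    relative_path = path.lstrip("/")
    return f"{scheme}://{file_system}@{account}.dfs.core.windows.net/{relative_path}"


# Hypothetical names, for illustration only.
print(abfs_uri("raw", "contosodatalake", "sales/2024/orders.parquet"))
```

In a Spark job, a URI like this could be passed to a reader such as `spark.read.parquet(...)`, assuming the cluster already has the ABFS driver and credentials configured.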
-For more information, see [Azure services that support Azure Data Lake Storage Gen2](data-lake-storage-supported-azure-services) and [Open source platforms that support Azure Data Lake Storage Gen2](data-lake-storage-supported-open-source-platforms.md).
+For more information, see [Azure services that support Azure Data Lake Storage Gen2](data-lake-storage-supported-azure-services.md) and [Open source platforms that support Azure Data Lake Storage Gen2](data-lake-storage-supported-open-source-platforms.md).
#### Hierarchical directory structure
@@ -69,11 +68,13 @@ This design means that Azure Data Lake Storage Gen2 can easily and quickly scale
## Built on Azure Blob Storage
-The data that you ingest persist as blobs in the storage account. The service that manages those blobs is the Azure Blob Storage service. Data Lake Storage Gen2 describes the capabilities or "enhancements" to this service that cater to the demands of big data analytic workloads. The Data Lake Storage Gen2 documentation provides best practices and guidance for using these capabilities.
+The data that you ingest persist as blobs in the storage account. The service that manages those blobs is the Azure Blob Storage service. Data Lake Storage Gen2 describes the capabilities or "enhancements" to this service that cater to the demands of big data analytic workloads.
+
+Because these capabilities are built on Blob Storage, features such as diagnostic logging, access tiers, and lifecycle management policies are available to your account. Most Blob Storage features are fully supported, but some features might be supported only at the preview level and there are a handful of them that are not yet supported. For a complete list of support statements, see [Blob Storage feature support in Azure Storage accounts](storage-feature-support-in-storage-accounts.md). The status of each listed feature will change over time as support continues to expand.
-Because these capabilities are built on Blob Storage, features such as diagnostic logging, access tiers, and lifecycle management policies are available to your account.
+## Documentation and terminology
-Most Blob Storage features are fully supported, but some features might be supported only at the preview level and there are a handful of them that are not yet supported. For a complete list of support statements, see [Blob Storage feature support in Azure Storage accounts](storage-feature-support-in-storage-accounts.md). The status of each listed feature will change over time as support continues to expand. The [Blob storage documentation](storage-blobs-introduction.md) provides guidance for account features not specific to Data Lake Storage Gen2.
+The Data Lake Storage Gen2 documentation provides best practices and guidance for using Data Lake Storage Gen2 capabilities. The [Blob Storage documentation](storage-blobs-introduction.md) provides guidance for account features not specific to Data Lake Storage Gen2.
As you move between content sets, you notice some slight terminology differences. For example, content featured in the [Blob storage documentation](storage-blobs-introduction.md), will use the term _blob_ instead of _file_. Technically, the files that you ingest to your storage account become blobs in your account. Therefore, the term is correct. However, the term _blob_ can cause confusion if you're used to the term _file_. You'll also see the term _container_ used to refer to a _file system_. Consider these terms as synonymous.
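To make the blob/file and container/file system equivalence concrete, here is a minimal sketch that rewrites a Blob endpoint URL into the matching Data Lake Storage endpoint URL and labels the path parts with both vocabularies. It assumes the public Azure endpoint hosts (`<account>.blob.core.windows.net` and `<account>.dfs.core.windows.net`); the function names and the sample account, container, and path are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit


def blob_url_to_dfs_url(blob_url: str) -> str:
    """Return the Data Lake Storage (dfs) endpoint URL for a Blob endpoint URL."""
    parts = urlsplit(blob_url)
    if ".blob." not in parts.netloc:
        raise ValueError(f"not a Blob endpoint URL: {blob_url}")
    # Only the host changes; the container and path are the same on both endpoints.
    dfs_host = parts.netloc.replace(".blob.", ".dfs.", 1)
    return urlunsplit((parts.scheme, dfs_host, parts.path, parts.query, parts.fragment))


def describe(url: str) -> dict:
    """Label the first path segment and the remainder with both sets of terms."""
    segments = urlsplit(url).path.lstrip("/").split("/", 1)
    return {
        "container / file system": segments[0],
        "blob / file": segments[1] if len(segments) > 1 else "",
    }


# Hypothetical URL, for illustration only.
blob_url = "https://contosodatalake.blob.core.windows.net/raw/sales/orders.csv"
print(blob_url_to_dfs_url(blob_url))
print(describe(blob_url))
```

The same object is addressed by both URLs; only the endpoint host and the vocabulary used to describe it differ.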