Commit 9c514f8

Trying something out
1 parent eab2ee3 commit 9c514f8

1 file changed: +22 -1 lines changed
articles/storage/blobs/data-lake-storage-introduction.md

Lines changed: 22 additions & 1 deletion
@@ -18,6 +18,25 @@ Azure Data Lake Storage Gen2 is a set of capabilities dedicated to big data anal

 Data Lake Storage Gen2 converges the capabilities of [Azure Data Lake Storage Gen1](../../data-lake-store/index.yml) with Azure Blob Storage. For example, Data Lake Storage Gen2 provides file system semantics, file-level security, and scale. Because these capabilities are built on Blob storage, you also get low-cost, tiered storage, with high availability/disaster recovery capabilities.

+## What is a data lake?
+
+A *data lake* is a single, centralized repository where you can store all your data, both structured and unstructured. A data lake enables your organization to quickly and more easily store, access, and analyze a wide variety of data in a single location. With a data lake, you don't need to conform your data to fit an existing structure. Instead, you can store your data in its raw or native format, usually as files or as binary large objects (blobs).
+
+When evaluating whether a data lake is the correct solution for your company, you should consider several elements as described in the following table.
+
+| **Element** | **Description** |
+| --- | --- |
+| **Data speed** | A data lake must be able to ingest data at any speed: from the occasional file, to large relational data imports, to real-time data generated by web server logs or IoT devices. |
+| **Data scalability** | A data lake might be required to store massive amounts of data that arrive in real time. Thus, the storage must be highly scalable to keep up with demand. |
+| **Data availability** | After the data is stored in a data lake, it must be readily available via browsing, searching, and indexing. |
+| **Data security** | Most data lakes store crucial data assets, including line-of-business (LOB) data, company-developed apps, and productivity output. The data lake requires robust security to protect these assets. |
+| **Data analytics** | A data lake must store data in a way that enables users to use their preferred tools to analyze the data in place. Business analysts, data scientists, and AI modelers need to use their own tools to derive business intelligence, insights, trends, and forecasts. |
+| | |
+
+## Azure Data Lake Storage definition
+
+*Azure Data Lake Storage* is a cloud-based, enterprise data lake solution. It's engineered to store massive amounts of data in any format, and to facilitate big data analytical workloads. You use it to capture data of any type and ingestion speed in a single location for easy access and analysis using various frameworks. The current implementation of Azure Data Lake Storage is Azure Data Lake Storage Gen2 and it is not a dedicated service. Data Lake Storage Gen2 is implemented as a set of capabilities that are built on top of the Azure Blob Storage service. The previous implementation, named Azure Data Lake Storage Gen1, is a dedicated service separate from Azure Storage. This service is scheduled to be retired on February 29, 2024.
+
 ## Designed for enterprise big data analytics

 Data Lake Storage Gen2 makes Azure Storage the foundation for building enterprise data lakes on Azure. Designed from the start to service multiple petabytes of information while sustaining hundreds of gigabits of throughput, Data Lake Storage Gen2 allows you to easily manage massive amounts of data.
@@ -34,7 +53,9 @@ Data Lake Storage Gen2 builds on Blob storage and enhances performance, manageme

 Also, Data Lake Storage Gen2 is very cost effective because it's built on top of the low-cost [Azure Blob Storage](storage-blobs-introduction.md). The extra features further lower the total cost of ownership for running big data analytics on Azure.

-## Key features of Data Lake Storage Gen2
+## Data Lake Storage Gen2 capabilities
+
+Add those paragraphs from the learn module

 - **Hadoop compatible access:** Data Lake Storage Gen2 allows you to manage and access data just as you would with a [Hadoop Distributed File System (HDFS)](https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HdfsDesign.html). The [ABFS driver](data-lake-storage-abfs-driver.md) (used to access data) is available within all Apache Hadoop environments. These environments include [Azure HDInsight](../../hdinsight/index.yml)*,* [Azure Databricks](/azure/databricks/), and [Azure Synapse Analytics](../../synapse-analytics/index.yml).

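As a rough illustration of the Hadoop compatible access bullet, the sketch below shows how a path for the ABFS driver is commonly written as a URI of the form `abfss://<container>@<account>.dfs.core.windows.net/<path>`. The helper name `build_abfs_uri` and the container, account, and file names are hypothetical and are not part of this commit or of any Azure SDK. The function only builds a string and doesn't connect to a storage account.

```python
def build_abfs_uri(container: str, account: str, path: str = "", secure: bool = True) -> str:
    """Build an ABFS URI string for a file or directory in a Gen2 storage account.

    Hypothetical helper for illustration only; it performs no I/O.
    """
    # "abfss" selects the TLS-secured variant of the scheme; "abfs" is the plain one.
    scheme = "abfss" if secure else "abfs"
    # Drop any leading slashes so the URI has exactly one separator after the host.
    clean_path = path.lstrip("/")
    return f"{scheme}://{container}@{account}.dfs.core.windows.net/{clean_path}"


if __name__ == "__main__":
    # Example values are placeholders, not real resources.
    print(build_abfs_uri("raw", "contosolake", "logs/2024/01.json"))
```

A Hadoop-compatible engine such as Spark would typically accept a string like this wherever it accepts an HDFS path, assuming the cluster is already configured with credentials for the storage account.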