Commit e81ba43

consolidating the table of links
1 parent e41fc5b commit e81ba43

File tree

1 file changed (+17 −18 lines)


articles/storage/blobs/data-lake-storage-best-practices.md

Lines changed: 17 additions & 18 deletions
@@ -148,31 +148,30 @@ Then, review the [Access control model in Azure Data Lake Storage Gen2](data-lak

 ## Ingest, process, and analyze

-Something here about documentation not being something that we do in this collection.
+There are many different sources of data and different ways in which that data can be ingested into a Data Lake Storage Gen2 enabled account.

-#### Ingesting data
+You can also ingest large sets of data from HDInsight or Hadoop clusters, or smaller sets of *ad hoc* data for prototyping applications.

-There are many different sources of data and the different ways in which that data can be ingested into a Data Lake Storage Gen2 enabled account. This table presents some common sources and the tools that we recommend for each source
+Streamed data is generated by various sources such as applications, devices, and sensors. You can use tools to capture and process the data on an event-by-event basis in real time, and then write the events in batches into your account.

-| Data source | Recommended tools |
-|---|---|
-| Ad hoc<br><br>Smaller sets of data for prototyping applications | <li>Azure portal</li><li>[Azure PowerShell](data-lake-storage-directory-file-acl-powershell.md)</li><li>[Azure CLI](data-lake-storage-directory-file-acl-cli.md)</li><li>[REST](/rest/api/storageservices/data-lake-storage-gen2)</li><li>[Azure Storage Explorer](https://azure.microsoft.com/features/storage-explorer/)</li><li>[Apache DistCp](data-lake-storage-use-distcp.md)</li><li>[AzCopy](../common/storage-use-azcopy-v10.md)</li>|
-| Streamed<br><br>Generated by various sources such as applications, devices, and sensors. Tools used to ingest this type of data usually capture and process the data on an event-by-event basis in real time, and then write the events in batches into your account. | <li>[HDInsight Storm](../../hdinsight/storm/apache-storm-write-data-lake-store.md)</li><li>[Azure Stream Analytics](../../stream-analytics/stream-analytics-quick-create-portal.md)</li> |
-| Relational<br><br>Records from relational databases || [Azure Data Factory](../../data-factory/connector-azure-data-lake-store.md) |
-| Web server logs<br><br>These files contain information such as the history of page requests. Consider writing custom scripts or applications to upload this data so you'll have the flexibility to include your data uploading component as part of your larger big data application. | <li>[Azure PowerShell](data-lake-storage-directory-file-acl-powershell.md)</li><li>[Azure CLI](data-lake-storage-directory-file-acl-cli.md)</li><li>[REST](/rest/api/storageservices/data-lake-storage-gen2)</li><li>Azure SDKs ([.NET](data-lake-storage-directory-file-acl-dotnet.md), [Java](data-lake-storage-directory-file-acl-java.md), [Python](data-lake-storage-directory-file-acl-python.md), and [Node.js](data-lake-storage-directory-file-acl-javascript.md))</li><li>[Azure Data Factory](../../data-factory/connector-azure-data-lake-store.md)</li> |
-| HD Insight<br><br>Data from HDInsight cluster types (For example: Hadoop, HBase, Storm) | <li>[Azure Data Factory](../../data-factory/connector-azure-data-lake-store.md)</li><li>[Apache DistCp](data-lake-storage-use-distcp.md)</li><li>[AzCopy](../common/storage-use-azcopy-v10.md)</li> |
-| Hadoop clusters<br><br>Running on-premise or in the cloud | <li>[Azure Data Factory](../../data-factory/connector-azure-data-lake-store.md)</li><li>[Apache DistCp](data-lake-storage-use-distcp.md)</li><li>[WANdisco LiveData Migrator for Azure](migrate-gen2-wandisco-live-data-platform.md)</li><li>[Azure Data Box](data-lake-storage-migrate-on-premises-hdfs-cluster.md)</li> |
-| Large data sets<br><br>Data sets that range in several terabytes | [Azure ExpressRoute documentation](../../expressroute/expressroute-introduction.md) |
+Web server logs contain information such as the history of page requests. Consider writing custom scripts or applications to upload web server logs so you'll have the flexibility to include your data uploading component as part of your larger big data application.

-#### Process, analyze, visualize, and download
+Once the data is available in Data Lake Storage Gen2, you can run analysis on that data, create visualizations, and even download data to your local machine or to other repositories such as an Azure SQL database or SQL Server instance.

-Once the data is available in Data Lake Storage Gen2 you can run analysis on that data, create visualizations, and even download data to your local machine or to other repositories such as an Azure SQL database or SQL Server instance. The following sections recommend tools that you can use to analyze, visualize, and download data.
+The following table recommends tools that you can use to ingest, analyze, visualize, and download data. Use the links in this table to find guidance about how to configure and use each tool.

-| Purpose | Recommended tool |
+| Purpose | Recommended tools |
 |---|---|
-| Process / Analyze | <li>[Azure Synapse Analytics](../../synapse-analytics/get-started-analyze-storage.md)</li><li>[Azure HDInsight](../../hdinsight/hdinsight-hadoop-use-data-lake-storage-gen2.md)</li><li>[Databricks](/azure/databricks/scenarios/databricks-extract-load-sql-data-warehouse) |
-| Visualize | <li>[Power BI](/power-query/connectors/datalakestorage)</li><li>[Azure Data Lake Storage query acceleration](data-lake-storage-query-acceleration.md)</li> |
-| Download | <li>Azure portal</li><li>[PowerShell](data-lake-storage-directory-file-acl-powershell.md)</li><li>[Azure CLI](data-lake-storage-directory-file-acl-cli.md)</li><li>[REST](/rest/api/storageservices/data-lake-storage-gen2)</li><li>Azure SDKs ([.NET](data-lake-storage-directory-file-acl-dotnet.md), [Java](data-lake-storage-directory-file-acl-java.md), [Python](data-lake-storage-directory-file-acl-python.md), and [Node.js](data-lake-storage-directory-file-acl-javascript.md))</li><li>[Azure Storage Explorer](data-lake-storage-explorer.md)</li><li>[AzCopy](../common/storage-use-azcopy-v10.md#transfer-data)</li><li>[Azure Data Factory](../../data-factory/copy-activity-overview.md)</li><li>[Apache DistCp](./data-lake-storage-use-distcp.md) |
+| Ingest ad hoc data | Azure portal, [Azure PowerShell](data-lake-storage-directory-file-acl-powershell.md), [Azure CLI](data-lake-storage-directory-file-acl-cli.md), [REST](/rest/api/storageservices/data-lake-storage-gen2), [Azure Storage Explorer](https://azure.microsoft.com/features/storage-explorer/), [Apache DistCp](data-lake-storage-use-distcp.md), [AzCopy](../common/storage-use-azcopy-v10.md) |
+| Ingest streaming data | [HDInsight Storm](../../hdinsight/storm/apache-storm-write-data-lake-store.md), [Azure Stream Analytics](../../stream-analytics/stream-analytics-quick-create-portal.md) |
+| Ingest relational data | [Azure Data Factory](../../data-factory/connector-azure-data-lake-store.md) |
+| Ingest web server logs | [Azure PowerShell](data-lake-storage-directory-file-acl-powershell.md), [Azure CLI](data-lake-storage-directory-file-acl-cli.md), [REST](/rest/api/storageservices/data-lake-storage-gen2), Azure SDKs ([.NET](data-lake-storage-directory-file-acl-dotnet.md), [Java](data-lake-storage-directory-file-acl-java.md), [Python](data-lake-storage-directory-file-acl-python.md), and [Node.js](data-lake-storage-directory-file-acl-javascript.md)), [Azure Data Factory](../../data-factory/connector-azure-data-lake-store.md) |
+| Ingest from HDInsight clusters | [Azure Data Factory](../../data-factory/connector-azure-data-lake-store.md), [Apache DistCp](data-lake-storage-use-distcp.md), [AzCopy](../common/storage-use-azcopy-v10.md) |
+| Ingest from Hadoop clusters | [Azure Data Factory](../../data-factory/connector-azure-data-lake-store.md), [Apache DistCp](data-lake-storage-use-distcp.md), [WANdisco LiveData Migrator for Azure](migrate-gen2-wandisco-live-data-platform.md), [Azure Data Box](data-lake-storage-migrate-on-premises-hdfs-cluster.md) |
+| Ingest large data sets (several terabytes) | [Azure ExpressRoute](../../expressroute/expressroute-introduction.md) |
+| Process & analyze data | [Azure Synapse Analytics](../../synapse-analytics/get-started-analyze-storage.md), [Azure HDInsight](../../hdinsight/hdinsight-hadoop-use-data-lake-storage-gen2.md), [Databricks](/azure/databricks/scenarios/databricks-extract-load-sql-data-warehouse) |
+| Visualize data | [Power BI](/power-query/connectors/datalakestorage), [Azure Data Lake Storage query acceleration](data-lake-storage-query-acceleration.md) |
+| Download data | Azure portal, [PowerShell](data-lake-storage-directory-file-acl-powershell.md), [Azure CLI](data-lake-storage-directory-file-acl-cli.md), [REST](/rest/api/storageservices/data-lake-storage-gen2), Azure SDKs ([.NET](data-lake-storage-directory-file-acl-dotnet.md), [Java](data-lake-storage-directory-file-acl-java.md), [Python](data-lake-storage-directory-file-acl-python.md), and [Node.js](data-lake-storage-directory-file-acl-javascript.md)), [Azure Storage Explorer](data-lake-storage-explorer.md), [AzCopy](../common/storage-use-azcopy-v10.md#transfer-data), [Azure Data Factory](../../data-factory/copy-activity-overview.md), [Apache DistCp](./data-lake-storage-use-distcp.md) |


 ## Monitor telemetry
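
The revised text recommends custom scripts or the Azure SDKs for ingesting ad hoc data and web server logs. As a minimal sketch of that kind of upload, assuming the Python `azure-storage-file-datalake` and `azure-identity` packages are installed and using placeholder account, container, and path names, the code could look something like this:

```python
# Minimal sketch: upload a local web server log to a Data Lake Storage Gen2 account.
# The account URL, file system (container) name, and paths are placeholders.
from azure.identity import DefaultAzureCredential
from azure.storage.filedatalake import DataLakeServiceClient

account_url = "https://<storage-account>.dfs.core.windows.net"  # placeholder account
service = DataLakeServiceClient(account_url, credential=DefaultAzureCredential())

file_system = service.get_file_system_client("raw")             # assumed container name
file_client = file_system.get_file_client("weblogs/2021/05/01/access.log")

with open("access.log", "rb") as data:
    # upload_data creates the file (or overwrites it) in a single call.
    file_client.upload_data(data, overwrite=True)
```

A small script like this can be embedded in a larger big data application, which is the flexibility the web server log paragraph points to.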
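
Downloading to a local machine follows the same pattern in reverse; again a hedged sketch with placeholder names:

```python
# Minimal sketch: download a file from Data Lake Storage Gen2 to the local machine.
from azure.identity import DefaultAzureCredential
from azure.storage.filedatalake import DataLakeServiceClient

service = DataLakeServiceClient(
    "https://<storage-account>.dfs.core.windows.net",  # placeholder account
    credential=DefaultAzureCredential(),
)
file_client = service.get_file_system_client("raw").get_file_client(
    "weblogs/2021/05/01/access.log"
)

with open("access-local.log", "wb") as target:
    # download_file returns a stream downloader; readinto streams it to the local file.
    file_client.download_file().readinto(target)
```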
