Commit 151678c

Merge pull request #111134 from julieMSFT/20200413_ria_whatislinks
2 parents 3ac8da4 + b2f8431 commit 151678c

4 files changed: 7 additions, 12 deletions
articles/synapse-analytics/overview-what-is.md

Lines changed: 5 additions & 4 deletions

@@ -81,7 +81,8 @@ Azure Synapse provides a single way for enterprises to manage analytics resource

 ## Next steps

-* Explore [Azure Synapse architecture](https://review.docs.microsoft.com/azure/sql-data-warehouse/massively-parallel-processing-mpp-architecture)
-* Quickly [create a SQL pool](https://review.docs.microsoft.com/azure/synapse-analytics/sql-data-warehouse/create-data-warehouse-portal)
-* [Load sample data](https://review.docs.microsoft.com/azure/sql-data-warehouse/sql-data-warehouse-load-sample-databases)
-* Explore [Videos](https://azure.microsoft.com/documentation/videos/index/?services=sql-data-warehouse)
+* [Create a workspace](quickstart-create-workspace.md)
+* [Use Synapse Studio](quickstart-synapse-studio.md)
+* [Create a SQL pool](quickstart-create-sql-pool.md)
+* [Use SQL on-demand](quickstart-sql-on-demand.md)
+* [Create an Apache Spark pool](quickstart-create-apache-spark-pool.md)

articles/synapse-analytics/spark/synapse-spark-sql-pool-import-export.md

Lines changed: 1 addition & 1 deletion

@@ -16,7 +16,7 @@ The Spark SQL Analytics Connector is designed to efficiently transfer data betwe

 ## Design

-Transferring data between Spark pools and SQL pools can be done using JDBC. However, given two distributed systems such as Spark and SQL pools (which provides massively parallel processing (MPP)), JDBC tends to be a bottleneck with serial data transfer.
+Transferring data between Spark pools and SQL pools can be done using JDBC. However, given two distributed systems such as Spark and SQL pools, JDBC tends to be a bottleneck with serial data transfer.

 The Spark pools to SQL Analytics Connector is a data source implementation for Apache Spark. It uses the Azure Data Lake Storage Gen 2, and Polybase in SQL pools to efficiently transfer data between the Spark cluster and the SQL Analytics instance.
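The serial-JDBC bottleneck the Design section describes can be illustrated with a toy sketch, not the connector's real implementation: one channel funnels every partition through a single loop, while a parallel path (analogous to per-node PolyBase readers against staged ADLS Gen2 files) ships partitions concurrently. The function names here are invented for illustration only.

```python
from concurrent.futures import ThreadPoolExecutor

def transfer_serial(partitions):
    """Toy stand-in for a single JDBC connection: every partition
    is drained one after another through one channel."""
    out = []
    for part in partitions:
        out.extend(part)
    return out

def transfer_parallel(partitions, workers=4):
    """Toy stand-in for MPP-to-MPP transfer: each partition is
    shipped independently, then the results are combined."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        shipped = list(pool.map(list, partitions))
    return [row for part in shipped for row in part]

# Four partitions of ten rows each; both paths move the same data,
# but the parallel path is not serialized through one channel.
partitions = [[(p * 10 + i, f"row-{p * 10 + i}") for i in range(10)]
              for p in range(4)]
assert sorted(transfer_serial(partitions)) == sorted(transfer_parallel(partitions))
```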

articles/synapse-analytics/sql/best-practices-sql-pool.md

Lines changed: 0 additions & 2 deletions

@@ -50,8 +50,6 @@ SQL pool supports loading and exporting data through several tools including Azu

 > [!NOTE]
 > Polybase is the best choice when you are loading or exporting large volumes of data, or you need faster performance.

-PolyBase is designed to leverage the MPP (Massively Parallel Processing) architecture of SQL pool and will load and export data more quickly than any other tool.

 PolyBase loads can be run using CTAS or INSERT INTO. CTAS will minimize transaction logging and is the fastest way to load your data. Azure Data Factory also supports PolyBase loads and can achieve performance similar to CTAS. PolyBase supports various file formats including Gzip files.

 To maximize throughput when using Gzip text files, break up files into 60 or more files to maximize parallelism of your load. For faster total throughput, consider loading data concurrently. Additional information for the topics relevant to this section is included in the following articles:

articles/synapse-analytics/sql/data-load-columnstore-compression.md

Lines changed: 1 addition & 5 deletions

@@ -72,10 +72,6 @@ The trim_reason_desc tells whether the rowgroup was trimmed(trim_reason_desc = N

 ## How to estimate memory requirements

-<!--
-To view an estimate of the memory requirements to compress a rowgroup of maximum size into a columnstore index, download and run the view [dbo.vCS_mon_mem_grant](). This view shows the size of the memory grant that a rowgroup requires for compression in to the columnstore.
--->

 The maximum required memory to compress one rowgroup is approximately

 - 72 MB +

@@ -117,7 +113,7 @@ Another reason to avoid over-partitioning is there is a memory overhead for load

 The database shares the memory grant for a query among all the operators in the query. When a load query has complex sorts and joins, the memory available for compression is reduced.

-Design the load query to focus only on loading the query. If you need to run transformations on the data, run them separate from the load query. For example, stage the data in a heap table, run the transformations, and then load the staging table into the columnstore index. You can also load the data first and then use the MPP system to transform the data.
+Design the load query to focus only on loading the query. If you need to run transformations on the data, run them separate from the load query. For example, stage the data in a heap table, run the transformations, and then load the staging table into the columnstore index.

 ### Adjust MAXDOP
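The stage-transform-load pattern kept in the line above can be sketched as a sequence of generated T-SQL steps. This is an illustrative sketch, not the article's own example: the table names, external source, and transformation are hypothetical placeholders, and the exact `WITH` options depend on your schema.

```python
def staged_load_statements(stage_table: str, target_table: str,
                           transform_sql: str) -> list[str]:
    """Return three steps: (1) load raw data into a heap staging table,
    (2) transform it separately from the load, (3) CTAS the result into
    a clustered columnstore target with minimal transaction logging."""
    return [
        # 1. Keep the load query simple: land rows in a heap.
        f"CREATE TABLE {stage_table} "
        f"WITH (DISTRIBUTION = ROUND_ROBIN, HEAP) "
        f"AS SELECT * FROM ext.source_data;",
        # 2. Run transformations against the staged heap, not in the load.
        transform_sql,
        # 3. CTAS into the columnstore index.
        f"CREATE TABLE {target_table} "
        f"WITH (DISTRIBUTION = ROUND_ROBIN, CLUSTERED COLUMNSTORE INDEX) "
        f"AS SELECT * FROM {stage_table};",
    ]

steps = staged_load_statements(
    "stg.sales_heap", "dbo.sales",
    "UPDATE stg.sales_heap SET amount = amount * 1.0;")
```

Separating step 2 from steps 1 and 3 keeps the memory grant of each load query free for columnstore compression rather than sorts and joins.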
