
Commit 31012a3

Revise Spark DataFrame and table descriptions
1 parent 01510ce commit 31012a3

File tree: 1 file changed (+4 −4 lines)

articles/synapse-analytics/synapse-link/how-to-query-analytical-store-spark-3.md

Lines changed: 4 additions & 4 deletions
@@ -31,23 +31,23 @@ The following sections walk you through the syntax of above capabilities. You ca
Customers can load analytical store data to Spark DataFrames or create Spark tables.

- The difference in experience is around whether underlying data changes in the Azure Cosmos DB container should be automatically reflected in the analysis performed in Spark. When either a Spark DataFrame is registered or a Spark table is created against a container's analytical store, metadata around the current snapshot of data in the analytical store is fetched to Spark for efficient pushdown of subsequent analysis. It is important to note that since Spark follows a lazy evaluation policy, unless an action is invoked on the Spark DataFrame or a SparkSQL query is executed against the Spark table, actual data is not fetched from the underlying container's analytical store.
+ The difference in experience is around whether underlying data changes in the Azure Cosmos DB container should be automatically reflected in the analysis performed in Spark. When a Spark DataFrame is registered or a Spark table is created, metadata about the current snapshot of the analytical store is fetched to Spark for efficient pushdown. Since Spark follows a lazy evaluation policy, actual data is not fetched from the analytical store until an action is invoked on the Spark DataFrame or a SparkSQL query is executed against the Spark table.
In the case of **loading to Spark DataFrame**, the fetched metadata is cached through the lifetime of the Spark session and hence subsequent actions invoked on the DataFrame are evaluated against the snapshot of the analytical store at the time of DataFrame creation.

On the other hand, in the case of **creating a Spark table**, the metadata of the analytical store state is not cached in Spark and is reloaded on every SparkSQL query execution against the Spark table.
- Thus, you can choose between loading to Spark DataFrame and creating a Spark table based on whether you want your Spark analysis to be evaluated against a fixed snapshot of the analytical store or against the latest snapshot of the analytical store respectively.
+ To conclude, you can choose between loading a fixed snapshot to a Spark DataFrame or querying a Spark table for the latest snapshot.
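The two options described in the changed paragraphs above can be sketched as follows. This is a minimal illustration, assuming a Synapse workspace where `MyLinkedService` and `MyContainer` are placeholder names for an existing Cosmos DB linked service and container:

```python
# Sketch, assuming a Synapse Spark pool with a Cosmos DB linked service.
# "MyLinkedService" and "MyContainer" are placeholder names.

# Load to a Spark DataFrame: the analytical store metadata snapshot is
# pinned at DataFrame creation and cached for the life of the Spark session.
df = spark.read.format("cosmos.olap") \
    .option("spark.synapse.linkedService", "MyLinkedService") \
    .option("spark.cosmos.container", "MyContainer") \
    .load()

# Create a Spark table: metadata is reloaded on every SparkSQL query,
# so each query sees the latest analytical store snapshot.
spark.sql("""
    CREATE TABLE IF NOT EXISTS MyTable
    USING cosmos.olap
    OPTIONS (
        spark.synapse.linkedService 'MyLinkedService',
        spark.cosmos.container 'MyContainer'
    )
""")
```

Because of lazy evaluation, neither statement by itself pulls data from the analytical store; data is fetched only when an action (for example `df.count()`) or a SparkSQL query against `MyTable` runs.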
> [!NOTE]
> To query Azure Cosmos DB for MongoDB accounts, learn more about the [full fidelity schema representation](/azure/cosmos-db/analytical-store-introduction#analytical-schema) in the analytical store and the extended property names to be used.

> [!NOTE]
- > All `options` in the commands below are case sensitive.
+ > All `options` are case sensitive.
## Authentication

- Now Spark 3.x customers can authenticate to Azure Cosmos DB analytical store using access tokens and database account keys, that are more secure as they are short lived, meaning less risk sincee it can only be generated by trusted identities, which have been approved by assigning them the required permission using Cosmos DB RBAC.
+ Spark 3.x customers can now authenticate to the Azure Cosmos DB analytical store using access tokens or database account keys. Access tokens are more secure because they are short lived and can only be generated by trusted identities that have been granted the required permissions through Cosmos DB RBAC.

The connector now supports two auth types, `MasterKey` and `AccessToken`, for the `spark.cosmos.auth.type` property.
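As a sketch of the two auth types named above: only the `spark.cosmos.auth.type` property and its `MasterKey`/`AccessToken` values come from this change; the linked service and container names are placeholders, and how the access token itself is obtained is environment specific and not shown here:

```python
# Sketch only; "MyLinkedService" and "MyContainer" are placeholders.

# Database account key auth: the key is resolved via the linked service.
df_key = spark.read.format("cosmos.olap") \
    .option("spark.synapse.linkedService", "MyLinkedService") \
    .option("spark.cosmos.container", "MyContainer") \
    .option("spark.cosmos.auth.type", "MasterKey") \
    .load()

# Access token auth: a short-lived token generated by a trusted identity
# that was granted permissions via Cosmos DB RBAC.
df_token = spark.read.format("cosmos.olap") \
    .option("spark.synapse.linkedService", "MyLinkedService") \
    .option("spark.cosmos.container", "MyContainer") \
    .option("spark.cosmos.auth.type", "AccessToken") \
    .load()
```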
