articles/synapse-analytics/synapse-link/how-to-query-analytical-store-spark-3.md
4 additions & 4 deletions
@@ -31,23 +31,23 @@ The following sections walk you through the syntax of above capabilities. You ca
Customers can load analytical store data to Spark DataFrames or create Spark tables.
- The difference in experience is around whether underlying data changes in the Azure Cosmos DB container should be automatically reflected in the analysis performed in Spark. When either a Spark DataFrame is registered or a Spark table is created against a container's analytical store, metadata around the current snapshot of data in the analytical store is fetched to Spark for efficient pushdown of subsequent analysis. It is important to note that since Spark follows a lazy evaluation policy, unless an action is invoked on the Spark DataFrame or a SparkSQL query is executed against the Spark table, actual data is not fetched from the underlying container's analytical store.
+ The difference in experience is around whether underlying data changes in the Azure Cosmos DB container should be automatically reflected in the analysis performed in Spark. When a Spark DataFrame is registered or a Spark table is created, metadata around the current snapshot of the analytical store is fetched to Spark for efficient pushdown. Because Spark follows a lazy evaluation policy, actual data is not fetched from the analytical store unless an action is invoked on the Spark DataFrame or a SparkSQL query is executed against the Spark table.
In the case of **loading to Spark DataFrame**, the fetched metadata is cached through the lifetime of the Spark session and hence subsequent actions invoked on the DataFrame are evaluated against the snapshot of the analytical store at the time of DataFrame creation.
On the other hand, in the case of **creating a Spark table**, the metadata of the analytical store state is not cached in Spark and is reloaded on every SparkSQL query execution against the Spark table.
- Thus, you can choose between loading to Spark DataFrame and creating a Spark table based on whether you want your Spark analysis to be evaluated against a fixed snapshot of the analytical store or against the latest snapshot of the analytical store respectively.
+ To conclude, you can choose between loading a snapshot to a Spark DataFrame or querying a Spark table for the latest snapshot, as illustrated in the sketch below.
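
For illustration, a minimal PySpark sketch of the two approaches. The linked service name `CosmosDbLinked` and the container name `orders` are hypothetical placeholders; the `cosmos.olap` format with `spark.synapse.linkedService` and `spark.cosmos.container` options follows the standard Synapse connector usage, but verify them against the rest of the article:

```python
# Load to a DataFrame: snapshot metadata is cached for the Spark session,
# so later actions evaluate against the store as it was at creation time.
df = (spark.read.format("cosmos.olap")
      .option("spark.synapse.linkedService", "CosmosDbLinked")  # hypothetical name
      .option("spark.cosmos.container", "orders")               # hypothetical name
      .load())
df.count()  # Lazy evaluation: data is fetched only when an action runs.

# Create a Spark table: snapshot metadata is reloaded on every query, so
# each SparkSQL query sees the latest state of the analytical store.
spark.sql("""
    CREATE TABLE IF NOT EXISTS orders_olap
    USING cosmos.olap
    OPTIONS (
        spark.synapse.linkedService 'CosmosDbLinked',
        spark.cosmos.container 'orders'
    )
""")
spark.sql("SELECT COUNT(*) FROM orders_olap").show()
```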
> [!NOTE]
> To query Azure Cosmos DB for MongoDB accounts, learn more about the [full fidelity schema representation](/azure/cosmos-db/analytical-store-introduction#analytical-schema) in the analytical store and the extended property names to be used.
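
As an illustration only, a hypothetical SparkSQL sketch of the extended property-name pattern, where each property is addressed together with its data type; the table name `mongo_orders`, the `address`/`city` properties, and the exact `.object`/`.string` suffixes are assumptions, so see the linked schema doc for the authoritative naming:

```python
# Hypothetical sketch: with the full fidelity schema, a property is
# queried through its type-suffixed path (suffixes per the linked doc).
spark.sql("SELECT address.object.city.string AS city FROM mongo_orders").show()
```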
> [!NOTE]
- > All `options`in the commands below are case sensitive.
+ > All `options` are case sensitive.
## Authentication
- Now Spark 3.x customers can authenticate to Azure Cosmos DB analytical store using access tokens and database account keys, that are more secure as they are short lived, meaning less risk sincee it can only be generated by trusted identities, which have been approved by assigning them the required permission using Cosmos DB RBAC.
+ Spark 3.x customers can now authenticate to the Azure Cosmos DB analytical store by using access tokens or database account keys. Access tokens are more secure because they are short lived and can be generated only by trusted identities that have been granted the required permissions through Cosmos DB RBAC.
The connector now supports two auth types, `MasterKey` and `AccessToken`, for the `spark.cosmos.auth.type` property.
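
As a hedged sketch of what token-based configuration might look like: only `spark.cosmos.auth.type` and its `MasterKey`/`AccessToken` values come from the text above; the linked service and container names and the `spark.cosmos.auth.accessToken` option name are assumptions to verify against the connector documentation:

```python
token = "<access-token>"  # short-lived token obtained by a trusted identity via Azure AD

df = (spark.read.format("cosmos.olap")
      # Hypothetical linked service and container names.
      .option("spark.synapse.linkedService", "CosmosDbLinked")
      .option("spark.cosmos.container", "orders")
      # spark.cosmos.auth.type and "AccessToken" come from the text above;
      # the accessToken option name below is an assumption.
      .option("spark.cosmos.auth.type", "AccessToken")
      .option("spark.cosmos.auth.accessToken", token)
      .load())
```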