
Commit e45457d

Changes based on Ornat's and Ohad's comments
1 parent 59e7eb9 commit e45457d

File tree: 1 file changed (+9 −7 lines)


articles/data-explorer/spark-connector.md

Lines changed: 9 additions & 7 deletions
@@ -26,8 +26,9 @@ This topic describes how to install and configure the Azure Data Explorer Spark
 
 * [Create an Azure Data Explorer cluster and database](/azure/data-explorer/create-cluster-database-portal)
 * Create a Spark cluster
-* Install Azure Data Explorer connector library
-    * Pre-built libraries for [Spark 2.4, Scala 2.11](https://github.com/Azure/azure-kusto-spark/releases) and [Maven repo](https://mvnrepository.com/artifact/com.microsoft.azure.kusto/spark-kusto-connector)
+* Install Azure Data Explorer connector library:
+    * Pre-built libraries for [Spark 2.4, Scala 2.11](https://github.com/Azure/azure-kusto-spark/releases)
+    * [Maven repo](https://mvnrepository.com/artifact/com.microsoft.azure.kusto/spark-kusto-connector)
 * [Maven 3.x](https://maven.apache.org/download.cgi) installed
 
 > [!TIP]
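
The pre-built library can also be consumed as a regular build dependency from the Maven repo linked above. A minimal sbt sketch (the version string is a placeholder; pick the release that matches your Spark 2.4 / Scala 2.11 cluster):

```scala
// build.sbt (sketch): pull the pre-built connector from the Maven repo.
// "<version>" is a placeholder; use the release matching Spark 2.4 / Scala 2.11.
libraryDependencies += "com.microsoft.azure.kusto" % "spark-kusto-connector" % "<version>"
```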
@@ -118,7 +119,7 @@ Azure AD application authentication is the simplest and most common authenticati
 Grant the following privileges on an Azure Data Explorer cluster:
 
 * For reading (data source), the Azure AD identity must have *viewer* privileges on the target database, or *admin* privileges on the target table.
-* For writing (data sink), the Azure AD identity must have *ingestor* privileges on the target database. It must also have *user* privileges on the target database to create new tables. If the target table already exists, you can configure *admin* privileges on the target table.
+* For writing (data sink), the Azure AD identity must have *ingestor* privileges on the target database. It must also have *user* privileges on the target database to create new tables. If the target table already exists, you must configure *admin* privileges on the target table.
 
 For more information on Azure Data Explorer principal roles, see [role-based authorization](/azure/kusto/management/access-control/role-based-authorization). For managing security roles, see [security roles management](/azure/kusto/management/security-roles).
 
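
In practice, the Azure AD identity above is an application whose credentials are passed to the connector as read/write options. A minimal Scala sketch of a write, assuming option key names and a sink provider class taken from the connector samples (verify them against the connector's KustoOptions for your version):

```scala
// Sketch only: option key names and the sink provider class are assumptions; verify against KustoOptions.
val appId       = "<AAD application (client) ID>"   // identity holding the privileges listed above
val appKey      = "<AAD application key>"
val authorityId = "<AAD tenant ID>"

val df = spark.range(10).toDF("ColA")               // any DataFrame to ingest

df.write
  .format("com.microsoft.kusto.spark.datasink.KustoSinkProvider")
  .option("kustoCluster", "<cluster name>")
  .option("kustoDatabase", "<database name>")       // requires ingestor (+ user, to create tables)
  .option("kustoTable", "<table name>")             // or admin on an existing target table
  .option("kustoAADClientID", appId)
  .option("kustoClientAADClientPassword", appKey)
  .option("kustoAADAuthorityID", authorityId)
  .mode("Append")
  .save()
```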
@@ -175,7 +176,7 @@ For more information on Azure Data Explorer principal roles, see [role-based aut
 import java.util.concurrent.TimeUnit
 import org.apache.spark.sql.streaming.Trigger
 
-// Set up a checkpoint and disable codeGen. Set up a checkpoint and disable codeGen as a workaround for an known issue
+// Set up a checkpoint and disable codeGen.
 spark.conf.set("spark.sql.streaming.checkpointLocation", "/FileStore/temp/checkpoint")
 
 // Write to a Kusto table from a streaming source
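
For context, the streaming write that follows this setup generally has the shape sketched below. The sink provider class, the option key names, and the `rate`-source placeholder DataFrame are assumptions standing in for the values used in the article; check the connector samples for the exact names:

```scala
// Sketch: write a streaming DataFrame to a Kusto table once the checkpoint is configured.
// Sink class and option names are assumptions; verify against the connector samples.
import java.util.concurrent.TimeUnit
import org.apache.spark.sql.streaming.Trigger

val conf: Map[String, String] = Map(
  "kustoCluster"                 -> "<cluster name>",
  "kustoDatabase"                -> "<database name>",
  "kustoTable"                   -> "<table name>",
  "kustoAADClientID"             -> "<AAD application ID>",
  "kustoClientAADClientPassword" -> "<AAD application key>",
  "kustoAADAuthorityID"          -> "<AAD tenant ID>"
)

// Placeholder streaming source; use any streaming DataFrame whose schema matches the target table.
val streamingDf = spark.readStream.format("rate").load()

val kustoQ = streamingDf
  .writeStream
  .format("com.microsoft.kusto.spark.datasink.KustoSinkProvider")
  .options(conf)
  .option("checkpointLocation", "/FileStore/temp/checkpoint")
  .trigger(Trigger.Once)                                        // process the available data once

kustoQ.start().awaitTermination(TimeUnit.MINUTES.toMillis(8))
```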
@@ -219,7 +220,8 @@ For more information on Azure Data Explorer principal roles, see [role-based aut
    display(df2)
    ```
 
-1. Optional: If **you** provide the transient blob storage (and not Microsoft) for reading [large amounts of data](/azure/kusto/concepts/querylimits), you must provide the storage container SAS key, or storage account name, account key, and container name.
+1. Optional: If **you** provide the transient blob storage (and not Azure Data Explorer), the blobs that are created are the caller's responsibility. This includes provisioning the storage, rotating access keys, and deleting transient artifacts.
+   The KustoBlobStorageUtils module contains helper functions for deleting blobs based on either account and container coordinates and account credentials, or a full SAS URL with write, read, and list permissions. Each read transaction stores its transient blob artifacts in a separate directory; this directory is captured as part of the read-transaction information logs reported on the Spark Driver node, and the blobs can be deleted once the corresponding RDD is no longer needed.
 
    ```scala
    // Use either container/account-key/account name, or container SaS
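   // (Sketch, not part of the original sample.) Cleanup of caller-provided transient blobs can use the
   // KustoBlobStorageUtils helpers described above. Helper names and parameter order here are assumptions;
   // verify against com.microsoft.kusto.spark.utils.KustoBlobStorageUtils in the connector source.
   import com.microsoft.kusto.spark.utils.KustoBlobStorageUtils

   // The per-transaction directory is reported in the read-transaction logs on the Spark driver node.
   val transientDirectory = "<directory logged for this read transaction>"

   // Either account/container coordinates plus the account key...
   KustoBlobStorageUtils.deleteFromBlob("<storage account>", transientDirectory, "<container>", "<account key>")

   // ...or a full SAS URL with read, write, and list permissions.
   KustoBlobStorageUtils.deleteFromBlob(transientDirectory, "<full SAS URL>")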
@@ -251,7 +253,7 @@ For more information on Azure Data Explorer principal roles, see [role-based aut
   display(dfFiltered)
   ```
 
-* If **Microsoft** provides the transient blob storage, read from Azure Data Explorer as follows:
+* If **Azure Data Explorer** provides the transient blob storage, read from Azure Data Explorer as follows:
 
   ```scala
   val dfFiltered = df2
@@ -266,4 +268,4 @@ For more information on Azure Data Explorer principal roles, see [role-based aut
 ## Next steps
 
 * Learn more about the [Azure Data Explorer Spark Connector](https://github.com/Azure/azure-kusto-spark/tree/master/docs)
-* [Sample code](https://github.com/Azure/azure-kusto-spark/tree/master/samples/src/main)
+* [Sample code for Java and Python](https://github.com/Azure/azure-kusto-spark/tree/master/samples/src/main)
