Merge pull request #259814 from seesharprun/cosmos-build-validation-fixes

denrea · web-flow · commit 3d94585d8174 · 2023-11-29T16:32:18.000-08:00
Cosmos DB | Fix build validation issues
diff --git a/articles/cosmos-db/cassandra/spark-databricks.md b/articles/cosmos-db/cassandra/spark-databricks.md
@@ -45,7 +45,7 @@ This article details how to work with Azure Cosmos DB for Apache Cassandra from
 
 * **Cassandra Spark connector:** - To integrate Azure Cosmos DB for Apache Cassandra with Spark, the Cassandra connector should be attached to the Azure Databricks cluster. To attach the connector:
 
-  * Review the Databricks runtime version and the Spark version. Then find the [Maven coordinates](https://mvnrepository.com/artifact/com.datastax.spark/spark-cassandra-connector-assembly) that are compatible with the Cassandra Spark connector, and attach it to the cluster. See the ["Upload a Maven package or Spark package"](https://docs.databricks.com/user-guide/libraries.html) article to attach the connector library to the cluster. We recommend selecting Databricks runtime version 10.4 LTS, which supports Spark 3.2.1. To add the Apache Spark Cassandra Connector to your cluster, select **Libraries** > **Install New** > **Maven**, and then add `com.datastax.spark:spark-cassandra-connector-assembly_2.12:3.2.0` in Maven coordinates. If using Spark 2.x, we recommend an environment with Spark version 2.4.5, using the Spark connector at Maven coordinates `com.datastax.spark:spark-cassandra-connector_2.11:2.4.3`.
+  * Review the Databricks runtime version and the Spark version. Then find the [Maven coordinates](https://mvnrepository.com/artifact/com.datastax.spark/spark-cassandra-connector-assembly) that are compatible with the Cassandra Spark connector, and attach it to the cluster. See the ["Upload a Maven package or Spark package"](https://docs.databricks.com/libraries) article to attach the connector library to the cluster. We recommend selecting Databricks runtime version 10.4 LTS, which supports Spark 3.2.1. To add the Apache Spark Cassandra Connector to your cluster, select **Libraries** > **Install New** > **Maven**, and then add `com.datastax.spark:spark-cassandra-connector-assembly_2.12:3.2.0` in Maven coordinates. If using Spark 2.x, we recommend an environment with Spark version 2.4.5, using the Spark connector at Maven coordinates `com.datastax.spark:spark-cassandra-connector_2.11:2.4.3`.
 
 * **Azure Cosmos DB for Apache Cassandra-specific library:** - If you're using Spark 2.x, a custom connection factory is required to configure the retry policy from the Cassandra Spark connector to Azure Cosmos DB for Apache Cassandra. Add the `com.microsoft.azure.cosmosdb:azure-cosmos-cassandra-spark-helper:1.2.0` [Maven coordinates](https://search.maven.org/artifact/com.microsoft.azure.cosmosdb/azure-cosmos-cassandra-spark-helper/1.2.0/jar) to attach the library to the cluster.
 
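
Reviewer note (not part of the article text): the version guidance in the hunk above amounts to a small lookup from Spark version to Maven coordinates. The sketch below is only an illustration under that reading. The function name `connector_coordinates` is hypothetical, and it encodes just the two recommendations the article states (Spark 3.2.1 on Databricks runtime 10.4 LTS, and Spark 2.4.5); other versions would need to be checked against the connector's published artifacts.

```python
from typing import List


def connector_coordinates(spark_version: str) -> List[str]:
    """Return the Maven coordinates the article recommends for a Spark version.

    Hypothetical helper: it covers only the two cases named in the article.
    """
    major = spark_version.split(".")[0]
    if major == "3":
        # Article recommendation for Databricks runtime 10.4 LTS (Spark 3.2.1).
        return ["com.datastax.spark:spark-cassandra-connector-assembly_2.12:3.2.0"]
    if major == "2":
        # Spark 2.x also needs the Cosmos DB helper library, which supplies
        # the custom connection factory used to configure the retry policy.
        return [
            "com.datastax.spark:spark-cassandra-connector_2.11:2.4.3",
            "com.microsoft.azure.cosmosdb:azure-cosmos-cassandra-spark-helper:1.2.0",
        ]
    raise ValueError(f"No recommendation in the article for Spark {spark_version}")
```

Each returned string is what would be pasted into **Libraries** > **Install New** > **Maven** on the cluster.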
diff --git a/articles/cosmos-db/nosql/migrate-relational-data.md b/articles/cosmos-db/nosql/migrate-relational-data.md
@@ -119,7 +119,7 @@ We can also use Spark in [Azure Databricks](https://azure.microsoft.com/services
 > [!NOTE]
 > For clarity and simplicity, the code snippets include dummy database passwords explicitly inline, but you should ideally use Azure Databricks secrets.
 
-First, we create and attach the required [SQL connector](/connectors/sql/) and [Azure Cosmos DB connector](https://docs.databricks.com/data/data-sources/azure/cosmosdb-connector.html) libraries to our Azure Databricks cluster. Restart the cluster to make sure libraries are loaded.
+First, we create and attach the required [SQL connector](/connectors/sql/) and [Azure Cosmos DB connector](/azure/databricks/external-data/cosmosdb-connector) libraries to our Azure Databricks cluster. Restart the cluster to make sure libraries are loaded.
 
 :::image type="content" source="./media/migrate-relational-data/databricks1.png" alt-text="Screenshot that shows where to create and attach the required SQL connector and Azure Cosmos DB connector libraries to our Azure Databricks cluster.":::
 
diff --git a/articles/cosmos-db/postgresql/concepts-sharding-models.md b/articles/cosmos-db/postgresql/concepts-sharding-models.md
@@ -51,18 +51,16 @@ Drawbacks:
 
 ## Sharding tradeoffs
 
-<br />
-
-|| Schema-based sharding | Row-based sharding|
-|---|---|---|
-|Multi-tenancy model|Separate schema per tenant|Shared tables with tenant ID columns|
-|Citus version|12.0+|All versions|
-|Extra steps compared to vanilla PostgreSQL|None, only a config change|Use create_distributed_table on each table to distribute & colocate tables by tenant ID|
-|Number of tenants|1-10k|1-1 M+|
-|Data modeling requirement|No foreign keys across distributed schemas|Need to include a tenant ID column (a distribution column, also known as a sharding key) in each table, and in primary keys, foreign keys|
-|SQL requirement for single node queries|Use a single distributed schema per query|Joins and WHERE clauses should include tenant_id column|
-|Parallel cross-tenant queries|No|Yes|
-|Custom table definitions per tenant|Yes|No|
-|Access control|Schema permissions|Schema permissions|
-|Data sharing across tenants|Yes, using reference tables (in a separate schema)|Yes, using reference tables|
-|Tenant to shard isolation|Every tenant has its own shard group by definition|Can give specific tenant IDs their own shard group via isolate_tenant_to_new_shard|
+| | Schema-based sharding | Row-based sharding |
+| --- | --- | --- |
+| **Multi-tenancy model** | Separate schema per tenant | Shared tables with tenant ID columns |
+| **Citus version** | 12.0+ | All versions |
+| **Extra steps compared to vanilla PostgreSQL** | None, only a config change | Use create_distributed_table on each table to distribute & colocate tables by tenant ID |
+| **Number of tenants** | 1-10k | 1-1 M+ |
+| **Data modeling requirement** | No foreign keys across distributed schemas | Need to include a tenant ID column (a distribution column, also known as a sharding key) in each table, and in primary keys, foreign keys |
+| **SQL requirement for single node queries** | Use a single distributed schema per query | Joins and WHERE clauses should include tenant_id column |
+| **Parallel cross-tenant queries** | No | Yes |
+| **Custom table definitions per tenant** | Yes | No |
+| **Access control** | Schema permissions | Schema permissions |
+| **Data sharing across tenants** | Yes, using reference tables (in a separate schema) | Yes, using reference tables |
+| **Tenant to shard isolation** | Every tenant has its own shard group by definition | Can give specific tenant IDs their own shard group via isolate_tenant_to_new_shard |
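
Reviewer note (not part of the article text): the "Extra steps compared to vanilla PostgreSQL" row can be made concrete with a short sketch that generates the setup statements for each model. This is a minimal illustration under some assumptions. The helper names, the table name `orders`, and the schema name `tenant_a` are hypothetical. The setting `citus.enable_schema_based_sharding` does not appear in this diff; it is assumed here to be the "config change" the table refers to and should be verified against the Citus documentation.

```python
from typing import List


def row_based_setup(tables: List[str], tenant_column: str = "tenant_id") -> List[str]:
    """Build one create_distributed_table call per table (row-based sharding)."""
    # Every table is distributed on the same tenant ID column.
    return [
        f"SELECT create_distributed_table('{table}', '{tenant_column}');"
        for table in tables
    ]


def schema_based_setup(tenant_schemas: List[str]) -> List[str]:
    """Build the statements for schema-based sharding (one schema per tenant)."""
    # Assumed setting name; the table only says "a config change".
    statements = ["SET citus.enable_schema_based_sharding TO on;"]
    statements += [f"CREATE SCHEMA {schema};" for schema in tenant_schemas]
    return statements


if __name__ == "__main__":
    for statement in row_based_setup(["orders"]) + schema_based_setup(["tenant_a"]):
        print(statement)
```

The row-based path touches every table, while the schema-based path needs only the one setting before tenant schemas are created, which is the contrast the table draws.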