Commit 1f21cce

Updates for Ignite
1 parent 296d0bb commit 1f21cce

1 file changed, +49 -10 lines changed

articles/cosmos-db/custom-partitioning-analytical-store.md

Lines changed: 49 additions & 10 deletions
@@ -20,13 +20,11 @@ In this article, you will learn how to partition your data in Azure Cosmos DB an
 > Custom partitioning feature is currently in public preview. This preview version is provided without a service level agreement, and it's not recommended for production workloads. Certain features might not be supported or might have constrained capabilities. For more information, see [Supplemental Terms of Use for Microsoft Azure Previews](https://azure.microsoft.com/support/legal/preview-supplemental-terms/).

 > [!NOTE]
-> Azure Cosmos DB accounts should have Azure Synapse Link(synapse-link.md) enabled to take advantage of custom partitioning. Custom partitioning is currently supported for Azure Synapse Spark 2.0 only.
+> Azure Cosmos DB accounts should have [Azure Synapse Link](synapse-link.md) enabled to take advantage of custom partitioning. Custom partitioning is currently supported for Azure Synapse Spark 2.0 only.

 ## How does it work?

-Analytical store partitioning is independent of partitioning in the transactional store. By default, analytical store is not partitioned. If you want to query analytical store frequently based on fields such as Date, Time, Category etc. you may benefit from creating a partitioned store based on these keys.
-
-With custom partitioning, you can choose a single field or a combination of fields from your dataset as the analytical store partition key.
+Analytical store partitioning is independent of partitioning in the transactional store. By default, analytical store is not partitioned. If you want to query analytical store frequently based on fields such as Date, Time, Category, etc., you can leverage custom partitioning to create a separate partitioned store based on these keys. You can choose a single field or a combination of fields from your dataset as the analytical store partition key.

 You can trigger partitioning from an Azure Synapse Spark notebook using Azure Synapse Link. You can schedule it to run as a background job, once or twice a day but can be executed more often, if needed.

@@ -76,7 +74,49 @@ If you configured [managed private endpoints](analytical-store-private-endpoints

 Similarly, if you configured [customer-managed keys on analytical store](how-to-setup-cmk.md#is-it-possible-to-use-customer-managed-keys-in-conjunction-with-the-azure-cosmos-db-analytical-store), you must directly enable it on the Synapse workspace primary storage account, which is the partitioned store, as well.

-
+## Partitioning strategies
+You could use one or more partition keys for your analytical data. If you are using multiple partition keys, below are some recommendations on how to partition the data:
+- **Using composite keys:**
+
+Say, you want to frequently query based on Key1 and Key2.
+
+For example, "Query for all records where ReadDate = ‘2021-10-08’ and Location = ‘Sydney’".
+
+In this case, using composite keys will be more efficient to look up all records that match the ReadDate and the records that match Location within that ReadDate.
+
+Sample configuration options:
+
+spark.cosmos.asns.basePath ”/mnt/CosmosDBPartitionedStore/”
+spark.cosmos.asns.partition.keys ”ReadDate String, Location String”
+
+Now, on the above partitioned store, if you want to query based only on the "Location" filter:
+* You may want to query analytical store directly. The partitioned store will scan all records by ReadDate first and then by Location.
+So, depending on your workload and the cardinality of your analytical data, you may get better results by querying analytical store directly.
+* You could also run another partition job to also partition based on ‘Location’ on the same partitioned store.
+
+* **Using multiple keys separately:**
+
+Say, you want to frequently query sometimes based on 'ReadDate' and other times, based on 'Location'.
+
+For example,
+- Query for all records where ReadDate = ‘2021-10-08’
+- Query for all records where Location = ‘Sydney’
+
+Run two partition jobs with partition keys as defined below for this scenario:
+
+Job 1:
+
+spark.cosmos.asns.basePath ”/mnt/CosmosDBPartitionedStore/”
+spark.cosmos.asns.partition.keys ”ReadDate String”
+
+Job 2:
+
+spark.cosmos.asns.basePath ”/mnt/CosmosDBPartitionedStore/”
+spark.cosmos.asns.partition.keys ”Location String”
+
+Please note that, with the above partitioning, it's not efficient to frequently query based on "ReadDate" and "Location" filters together. Composite keys will give
+better query performance in that case.
+
 ## Limitations

 * Custom partitioning is only available for Azure Synapse Spark. Custom partitioning is currently not supported for serverless SQL pools.
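
Reviewer note: the two strategies added in this hunk differ only in how the `spark.cosmos.asns.partition.keys` value is grouped across jobs. The sketch below is one hedged way to express that in plain Python. The option names and the base path come from the diff; the helper names `partition_options` and `plan_jobs` are hypothetical, every key is assumed to be of type `String`, and how the resulting options are handed to a Spark job is outside what this commit shows.

```python
from typing import Dict, List

# Base path taken from the sample configuration in the diff.
BASE_PATH = "/mnt/CosmosDBPartitionedStore/"


def partition_options(keys: List[str], base_path: str = BASE_PATH) -> Dict[str, str]:
    """Build the two options for one partition job (hypothetical helper).

    `keys` holds entries such as "ReadDate String"; several entries form a
    composite key and are joined with ", " as in the sample configuration.
    """
    if not keys:
        raise ValueError("at least one partition key is required")
    return {
        "spark.cosmos.asns.basePath": base_path,
        "spark.cosmos.asns.partition.keys": ", ".join(keys),
    }


def plan_jobs(filter_sets: List[List[str]], key_type: str = "String") -> List[Dict[str, str]]:
    """Return one partition job per distinct set of fields that is queried together.

    Fields filtered together become one composite key; fields filtered on
    their own each get a separate job on the same base path.
    """
    jobs: List[Dict[str, str]] = []
    seen = set()
    for fields in filter_sets:
        signature = tuple(fields)
        if signature in seen:
            continue  # the same key combination needs only one job
        seen.add(signature)
        jobs.append(partition_options([f"{name} {key_type}" for name in fields]))
    return jobs


if __name__ == "__main__":
    # Queries always filter on ReadDate and Location together: one composite job.
    print(plan_jobs([["ReadDate", "Location"]]))
    # Queries filter on ReadDate or on Location, never both: two separate jobs.
    print(plan_jobs([["ReadDate"], ["Location"]]))
```

A workload that mixes both query shapes could pass all three field sets and get three jobs, at the cost of extra partitioning runs against the same partitioned store.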
@@ -116,13 +156,12 @@ Yes, the partition key for the given container can be changed and the new partit

 ### Can different partition keys point to the same BasePath?

-Yes, since the partition key definition is part of the partitioned store path, different partition keys will have different paths branching from the same BasePath.
+Yes, you can specify multiple partition keys on the same partitioned store as below:

-Base path format could be specified as: /mnt/partitionedstorename/\<Cosmos_DB_account_name\>/\<Cosmos_DB_database_rid\>/\<Cosmos_DB_container_rid\>/partition=partitionkey/

-For example:
-/mnt/CosmosDBPartitionedStore/store_sales/…/partition=sold_date/...
-/mnt/CosmosDBPartitionedStore/store_sales/…/partition=Date/...
+spark.cosmos.asns.basePath ”/mnt/CosmosDBPartitionedStore/”
+spark.cosmos.asns.partition.keys ”ReadDate String, Location String”
+

 ## Next steps
