Commit abc0055

Merge pull request #218590 from iriaosara/tokenDoc
Overview for token and token functions
2 parents e8cb04b + 0b8e0c3 commit abc0055

2 files changed: +77 -0 lines changed

articles/cosmos-db/cassandra/TOC.yml

Lines changed: 2 additions & 0 deletions
@@ -6,6 +6,8 @@
   href: introduction.md
 - name: Wire protocol support
   href: support.md
+- name: Tokens and the Token function
+  href: tokens.md
 - name: FAQ
   href: cassandra-faq.yml
 - name: Quickstarts
Lines changed: 75 additions & 0 deletions
@@ -0,0 +1,75 @@
1+
---
2+
title: Tokens and the Token Function in Azure Cosmos DB for Apache Cassandra
3+
description: Tokens and the Token Function in Azure Cosmos DB for Apache Cassandra.
4+
author: IriaOsara
5+
ms.author: iriaosara
6+
ms.service: cosmos-db
7+
ms.subservice: apache-cassandra
8+
ms.topic: overview
9+
ms.date: 11/04/2022
10+
---
11+
12+
# Tokens and the Token Function in Azure Cosmos DB for Apache Cassandra
13+
14+
[!INCLUDE[Cassandra](../includes/appliesto-cassandra.md)]
15+
16+
This article discusses tokens and the Token function in Azure Cosmos DB for Apache Cassandra, and clarifies how the computation and usage of tokens differ from native Apache Cassandra.
17+
18+
## What is a Token
19+
20+
A token is the hash of a partition key and is used to distribute data across the cluster. In Apache Cassandra, each node is assigned a range of tokens; you can assign the token ranges yourself or let Cassandra assign them. When data is ingested, Cassandra calculates the token of the partition key and uses it to find the node that stores the new data.
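
The lookup described above can be sketched in a few lines of Python. This is an illustration only: the four-node ring, the node names, and the `owner_of` helper are hypothetical, and the sketch assumes the usual Apache Cassandra convention that a node owns the range ending at its own token (inclusive). It does not compute the hash itself; it takes an already computed token as input.

```python
from bisect import bisect_left

# Hypothetical four-node ring. Each node owns the range
# (previous node's token, its own token], as assumed in the lead-in.
RING = [
    (-4611686018427387904, "node-a"),
    (0, "node-b"),
    (4611686018427387904, "node-c"),
    (9223372036854775807, "node-d"),
]
TOKENS = [token for token, _ in RING]


def owner_of(token: int) -> str:
    """Return the (hypothetical) node whose token range contains `token`."""
    index = bisect_left(TOKENS, token)
    # A token past the last entry would wrap around to the first node.
    return RING[index % len(RING)][1]


# A sample 64-bit token value, used here only as input to the lookup.
print(owner_of(2601062599670757427))
```

Because the ring is kept sorted, finding the owner is a binary search over the node tokens rather than a scan of every range.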
21+
22+
## What is the Token Function
23+
24+
The Token function is available through the CQL API of a Cassandra cluster and exposes the partitioning function used by the cluster. As a CQL function, Token differs from most other functions because it restricts the parameters passed to it based on the table that you're querying. The number of parameters must equal the number of partition key columns of the table being queried, and the data type of each parameter must match the data type of the corresponding partition key column.
25+
26+
In Apache Cassandra, this restriction is arbitrary and applies only to constant values passed to the function. The most notable use of the Token function is applying relations to the token of the partition key. Azure Cosmos DB for Apache Cassandra allows `SELECT` queries to use a `WHERE` clause that filters on the tokens of your data instead of the data itself.
27+
28+
```sql
SELECT token(accountid) FROM uprofile.accounts;

system.token(accountid)
-------------------------
2601062599670757427
2976626013207263698

```
37+
38+
```sql
SELECT name, accountid, state, country
FROM uprofile.accounts
WHERE token(accountid) = 2976626013207263698;

 name  | accountid | state | country
-------+-----------+-------+---------
 Devon |       405 | NYC   | USA

```
48+
49+
> [!NOTE]
50+
> In this usage, only the partition key columns can be specified as parameters to the Token function.
51+
> This usage of the function is merely a placeholder that lets you filter directly on the partition hash instead of the partition key value. It's useful for breaking up a scan into sub-ranges and reading data from a table in parallel.
52+
> Also, Azure Cosmos DB for Apache Cassandra doesn't allow range queries on the partition key itself.
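
As a rough illustration of breaking a scan into sub-ranges, the following Python sketch splits the token space into contiguous pieces and builds one CQL query per piece. It rests on assumptions that aren't stated in this article: that tokens span the signed 64-bit range used by Murmur3Partitioner in Apache Cassandra, and that `>=` and `<=` relations on `token(...)` are accepted. The helper names `split_token_range` and `scan_queries` are hypothetical, and the sketch only produces query strings; it doesn't connect to a cluster.

```python
# Assumed token bounds: the signed 64-bit range of Murmur3Partitioner.
MIN_TOKEN = -(2**63)
MAX_TOKEN = 2**63 - 1


def split_token_range(parts: int) -> list[tuple[int, int]]:
    """Split the full token range into `parts` contiguous, inclusive sub-ranges."""
    if parts < 1:
        raise ValueError("parts must be at least 1")
    step = (MAX_TOKEN - MIN_TOKEN + 1) // parts
    ranges = []
    start = MIN_TOKEN
    for i in range(parts):
        # The last sub-range absorbs any remainder so the whole space is covered.
        end = MAX_TOKEN if i == parts - 1 else start + step - 1
        ranges.append((start, end))
        start = end + 1
    return ranges


def scan_queries(table: str, partition_keys: list[str], parts: int) -> list[str]:
    """Build one token-range SELECT per sub-range (hypothetical helper)."""
    token_expr = f"token({', '.join(partition_keys)})"
    return [
        f"SELECT * FROM {table} WHERE {token_expr} >= {lo} AND {token_expr} <= {hi};"
        for lo, hi in split_token_range(parts)
    ]


for query in scan_queries("uprofile.accounts", ["accountid"], 4):
    print(query)
```

Each generated query can then be handed to a separate worker, so that no two workers read the same partition hash.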
53+
54+
## How Token works in Azure Cosmos DB for Apache Cassandra
55+
56+
Azure Cosmos DB for Apache Cassandra uses Murmur3Partitioner, the default partitioner in native Apache Cassandra. Murmur3Partitioner performs better than other partitioners and hashes keys faster. The service uses the same Murmur3Partitioner function with some variations, to stay compatible with the third-party tools that work against the default Murmur3Partitioner in Apache Cassandra.
57+
58+
Keep the following limitations and behaviors in mind when you use the Token function in Azure Cosmos DB for Apache Cassandra:
59+
60+
1. When used as a projection, the Token function can be applied only to the partition key columns. That is, it can only project the token of the row(s).
61+
2. For a given partition key value, the token value generated by Azure Cosmos DB for Apache Cassandra differs from the token value generated by Apache Cassandra.
62+
3. Usage of the Token function in `WHERE` clauses is the same in Azure Cosmos DB for Apache Cassandra and Apache Cassandra.
63+
64+
> [!NOTE]
65+
> Use the Token function only to project the actual token(pk) of the row, or for token scans, where it appears on the left-hand side of `WHERE` clauses.
66+
67+
### Which scenarios are unsupported in Azure Cosmos DB for Apache Cassandra but supported in Apache Cassandra?
68+
The following scenarios are unsupported for Azure Cosmos DB for Apache Cassandra:
69+
1. The Token function used as a projection on non-partition key columns.
70+
2. The Token function used as a projection on constant values.
71+
3. The Token function used on the right-hand side of a token relation in a `WHERE` clause.
72+
73+
## Next steps
74+
75+
- Get started with [creating an API for Cassandra account, database, and a table](manage-data-python.md) by using a Python application
