From 6bf9e8801bf878011642017e014bea3d23a77cde Mon Sep 17 00:00:00 2001
From: Elena Kolevska
Date: Thu, 27 Jun 2024 10:35:37 +0100
Subject: [PATCH 1/9] Adds list API proposal

Signed-off-by: Elena Kolevska
---
 20240627-BC-listapi.md | 117 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 117 insertions(+)
 create mode 100644 20240627-BC-listapi.md

diff --git a/20240627-BC-listapi.md b/20240627-BC-listapi.md
new file mode 100644
index 0000000..2bce582
--- /dev/null
+++ b/20240627-BC-listapi.md
@@ -0,0 +1,117 @@
+# List API for the Dapr state component
+
+This proposal proposes implementing a List API in Dapr's state component. The List API will enable the retrieval of keys in a state store based on certain criteria, providing users with the necessary visibility into stored keys. List API results will **not** include the value. SDKs could provide the possibility to bulk get all the keys returned in a page.
+
+The requirements for the API are:
+
+- Ability to list all keys in a state store
+- Ability to list keys in a state store with a certain prefix
+- The results can be sorted
+- The results can be paginated
+
+As with the other state store APIs, the List API will also have the difficult job of finding a set of features that are supported across most state store components and filling in the gaps with reasonable behaviour when they aren’t.
+
+## API
+
+### HTTP
+
+Developers can list keys by issuing an HTTP API call to the Dapr sidecar:
+
+```bash
+GET /v1.0/state/:storeName/?prefix={prefix}&sorting={sorting}&page_limit={pageLimit}&page_token={pageToken}
+```
+
+### gRPC
+
+Developers can also list keys by issuing a unary gRPC call
+
+```bash
+service Dapr {
+  ...
+  rpc ListState(ListRequest) returns (ListResponse) {}
+  ...
+}
+
+message ListStateRequest {
+  // The prefix that should be used for listing.
+  optional string prefix = 1;
+
+  // The maximum number of items that should be returned per page
+  optional uint32 page_limit = 2;
+
+  // Specifies if the result should be sorted
+  optional Sort sort = 3;
+
+  // Specifies the next pagination token
+  // If this is empty, it indicates the start of a new listing.
+  optional string page_token = 4;
+
+  // Sorting order options
+  enum Sort {
+    DEFAULT = 0;
+    ASCENDING = 1;
+    DESCENDING = 2;
+  }
+}
+
+message ListStateResponse {
+  // The items that were listed
+  repeated string keys = 1;
+
+  // The next pagination token that should be sent for the next page
+  // If this is empty, it indicates that there are no more pages.
+  optional string next_page_token = 2;
+}
+```
+
+### Default values
+
+- Prefix: “”
+- Sorting: “default”
+- Page limit: 50
+- Next token: “”
+
+## **Pagination**
+
+The two most common pagination strategies are token and offset-based pagination.
+
+**Offset-based pagination**
+Uses a fixed offset and limit to retrieve a subset of results from a larger dataset. This method is common in relational databases and is implemented with the `LIMIT` and `OFFSET` clauses.
+
+It’s not common in no-sql databases, but it is very common in SQL databases. It relies on a table scan and skipping results until it reaches the offset value.
+
+**Token-based pagination**
+
+Relies on a token usually equal to, or derived from the last element in the last returned page.
+
+Very common in no-sql databases that do a scan across the keyspace.
+
+In relational databases this method relies on an indexed column, such as a timestamp or an ID, to ensure efficient sorting and querying. For example:
+
+```bash
+SELECT * FROM items WHERE key > last_key_id ORDER BY key;
+```
+
+---
+
+Most often, offset-based pagination is not possible in no-sql databases, while it’s easy (even preferable) to implement in relational databases, so this proposal suggests using **token-based pagination** in the List API.
+
+Based on this decision, listing items will only be available forwards, and not backwards. To list previous pages, the application would have to keep track of the page tokens.
+
+## **Sorting**
+
+Sorting is required for token-based pagination in relational databases, so we must have a default sorting order.
+
+Some no-sql databases (ex. Azure blob store) don’t support sorting in descending order and others don’t support any sorting at all (ex. Redis). In these cases, we want to return an explicit error instead of failing silently.
+
+This might be restricting for use cases where the underlying state store needs to be swapped though. For example a team could use Redis for local development, and Postgres in production, and they wouldn’t be able to use the same application code, because the sorting clause would error on Redis, but pass on Postgres. That’s why we’re introducing the `Default` sorting option which will sort in ascending order for all databases that support it, and leave results unsorted for the databases that don’t.
+
+## SDKs
+
+All supported SDKs should be updated to implement the List API. SDKs should offer the option to fetch batch values of the returned keys.
+
+## Default behaviour for state stores with missing features
+
+Some of the state stores Dapr supports don’t provide the necessary capabilities for implementing the list API. For example, Memcached doesn’t provide a way to list keys, Azure table storage can’t sort keys in descending order and so on. For those cases the list API will do a best effort to provide the closest functionality to the one defined in the API. The functionality will be specific to the data store and will be implemented on the component level.
+
+List API requests on state stores that don’t support the List API will result in errors.
\ No newline at end of file

From 5a1cae5ec2f3fe2fcf2a402b6724168adebdd93b Mon Sep 17 00:00:00 2001
From: Elena Kolevska
Date: Thu, 27 Jun 2024 10:37:19 +0100
Subject: [PATCH 2/9] Adds definitions

Signed-off-by: Elena Kolevska
---
 20240627-BC-listapi.md | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/20240627-BC-listapi.md b/20240627-BC-listapi.md
index 2bce582..862cd58 100644
--- a/20240627-BC-listapi.md
+++ b/20240627-BC-listapi.md
@@ -114,4 +114,11 @@ All supported SDKs should be updated to implement the List API. SDKs should offe
 
 Some of the state stores Dapr supports don’t provide the necessary capabilities for implementing the list API. For example, Memcached doesn’t provide a way to list keys, Azure table storage can’t sort keys in descending order and so on. For those cases the list API will do a best effort to provide the closest functionality to the one defined in the API. The functionality will be specific to the data store and will be implemented on the component level.
 
-List API requests on state stores that don’t support the List API will result in errors.
\ No newline at end of file
+List API requests on state stores that don’t support the List API will result in errors.
+
+## Definitions:
+
+- **Listing**: The ability to retrieve a collection of items.
+- **Sorting**: The ability to sort the results based on one or more fields.
+- **Prefix Search**: The ability to search for items that start with a given prefix.
+- **Pagination**: The ability to paginate through items, typically using skip/limit or similar mechanisms.
\ No newline at end of file

From dc407ca0a503e967a75c69223522a9729a5a55eb Mon Sep 17 00:00:00 2001
From: Elena Kolevska
Date: Thu, 27 Jun 2024 14:21:37 +0100
Subject: [PATCH 3/9] Adds a paragraph on stable status impact

Signed-off-by: Elena Kolevska
---
 20240627-BC-listapi.md | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/20240627-BC-listapi.md b/20240627-BC-listapi.md
index 862cd58..8efd2d1 100644
--- a/20240627-BC-listapi.md
+++ b/20240627-BC-listapi.md
@@ -116,6 +116,10 @@ Some of the state stores Dapr supports don’t provide the necessary capabilitie
 
 List API requests on state stores that don’t support the List API will result in errors.
 
+## Impact of the List API on Dapr state store components
+From the moment this proposal is accepted, all state store components will be required to implement the List API in order to get the "Stable" certification level.
+Components that are currently stable and for which the underlying state store does not support listing will not lose their stable status.
+
 ## Definitions:
 
 - **Listing**: The ability to retrieve a collection of items.

From df5edde0fc81091a0e8428bae5f5df33fc1a776b Mon Sep 17 00:00:00 2001
From: Elena Kolevska
Date: Thu, 27 Jun 2024 14:35:03 +0100
Subject: [PATCH 4/9] Adds examples

Signed-off-by: Elena Kolevska
---
 20240627-BC-listapi.md | 31 +++++++++++++++++++++++++++++++
 1 file changed, 31 insertions(+)

diff --git a/20240627-BC-listapi.md b/20240627-BC-listapi.md
index 8efd2d1..701537e 100644
--- a/20240627-BC-listapi.md
+++ b/20240627-BC-listapi.md
@@ -21,6 +21,33 @@ Developers can list keys by issuing an HTTP API call to the Dapr sidecar:
 GET /v1.0/state/:storeName/?prefix={prefix}&sorting={sorting}&page_limit={pageLimit}&page_token={pageToken}
 ```
 
+The `sorting` query parameter can accept one of the following values:
+- `default`
+- `asc`
+- `desc`
+
+
+The response will be a JSON object with the following structure:
+```json
+{
+  "keys": ["key1", "key2", "key3", "...", "keyN"],
+  "next_page_token": "nextTokenString"
+}
+```
+
+For example:
+Request:
+```cURL
+GET /v1.0/state/myStateStore?prefix=user&sorting=asc&page_limit=3&page_token=user3
+```
+Response:
+```json
+{
+  "keys": ["user4", "user5", "user6"],
+  "next_page_token": "user6"
+}
+```
+
 ### gRPC
 
 Developers can also list keys by issuing a unary gRPC call
@@ -120,6 +147,10 @@ List API requests on state stores that don’t support the List API will result
 From the moment this proposal is accepted, all state store components will be required to implement the List API in order to get the "Stable" certification level.
 Components that are currently stable and for which the underlying state store does not support listing will not lose their stable status.
 
+## Performance and pricing implications
+Listing keys in big data sets, specially for partitioned databases, can be expensive in terms of both performance and cost. Often it would incur creating an index which will impact write performance, storage cost and sometimes even read performance.
+For the databases where this is a concern, we should offer an option to disable the List API on the component level.
+
 ## Definitions:
 
 - **Listing**: The ability to retrieve a collection of items.

From 6e2989822b3a1f3497f64c9dd666dd465dee0bdd Mon Sep 17 00:00:00 2001
From: Elena Kolevska
Date: Tue, 24 Sep 2024 15:05:22 +0100
Subject: [PATCH 5/9] Adds the feature comparison table

Signed-off-by: Elena Kolevska
---
 20240627-BC-listapi.md | 25 ++++++++++++++++++++++++-
 1 file changed, 24 insertions(+), 1 deletion(-)

diff --git a/20240627-BC-listapi.md b/20240627-BC-listapi.md
index 701537e..8313355 100644
--- a/20240627-BC-listapi.md
+++ b/20240627-BC-listapi.md
@@ -156,4 +156,27 @@ For the databases where this is a concern, we should offer an option to disable
 - **Listing**: The ability to retrieve a collection of items.
 - **Sorting**: The ability to sort the results based on one or more fields.
 - **Prefix Search**: The ability to search for items that start with a given prefix.
-- **Pagination**: The ability to paginate through items, typically using skip/limit or similar mechanisms.
\ No newline at end of file
+- **Pagination**: The ability to paginate through items, typically using skip/limit or similar mechanisms.
+
+## Adendum
+
+Here's a list of the relevant capabilities of all the stable state stores:
+
+|   | Store | Cursor listing | Offset listing | Sorting | Number of Items per Page | Prefix Search | Comments |
+| --- | --- | --- | --- | --- | --- | --- | --- |
+| 1 | aws dynamodb | Yes | No | Yes, with a GSI | Yes | Yes, with an additional sortKey and a GSI | In order to be able to use prefix search, users will need to have a Global Search Index(GSI) where the partition key will be a single fixed string (for ex. the `_` character) and the sort key will be the key name. There are some drawbacks to this that can be discussed in detail elsewhere. |
+| 2 | azure blob store | Yes (continuation token) | No | Always sorted in ASC order. Desc, or unsorted is not possible. | Yes | Yes | Results are always sorted by key name in ascending order. |
+| 3 | azure cosmos db | Yes | Yes | Yes | Yes | Yes |   |
+| 4 | azure table storage | Yes | No | Yes, just ASC | Yes, with $top | Yes, with range search | Partition key is the application id. |
+| 5 | cassandraYes | No | No | No | No | Can’t prefix search and sort across all partitions. We could consider maintaining a new table containing all keys, and mirroring the original key’s ttl. |   |
+| 6 | cockroachdbYes, if sorting is required | Yes | Yes | No | Yes | Need to create an index on the search column |   |
+| 7 | gcp firestore | Yes |   |   |   |   |   |
+| 8 | in-memory | No | No | No | No | No | We can implement all the features, but it’s not trivial to aggregate data across multiple instances |
+| 9 | memcached | No | No | No | No | No |   |
+| 10 | mongodbYes | Yes | Yes |   | Yes |   |   |
+| 11 | mysqlYes | Yes |   |   | Yes | Need to create an index on the id column. MySql supports specialized prefix indices, but you would have to know the exact length of the prefix you’ll be searching on, also sorting will not use the index. |   |
+| 12 | postgresqlYes | Yes | Yes | No | Yes | Need to create an index on the key column. We can use the varchar\_pattern\_ops operator class, optimised for prefix search. |   |
+| 13 | redisYes | No | No | No | Yes | Number of record per page is not guaranteed, but best effort. |   |
+| We could maintain our own sortedset that keeps the ttl. |   |   |   |   |   |   |   |
+| 14 | sqliteYes, if sorting is required | Yes | Yes | No | Yes | Need to create an index on the key column. It’s a standard b-tree index.We could maintain an index of all keys in a hash |   |
+| 15 | sqlserver | Yes, if sorting is required | Yes | Yes | Yes | Yes | need to create a non-clustered index on the “key” column |

From 529e3020e7c8823a64ebd3413a8303fd46737bc9 Mon Sep 17 00:00:00 2001
From: Elena Kolevska
Date: Tue, 24 Sep 2024 15:08:12 +0100
Subject: [PATCH 6/9] remove unneeded column to maximise space

Signed-off-by: Elena Kolevska
---
 20240627-BC-listapi.md | 36 ++++++++++++++++++------------------
 1 file changed, 18 insertions(+), 18 deletions(-)

diff --git a/20240627-BC-listapi.md b/20240627-BC-listapi.md
index 8313355..54a6b87 100644
--- a/20240627-BC-listapi.md
+++ b/20240627-BC-listapi.md
@@ -162,21 +162,21 @@ For the databases where this is a concern, we should offer an option to disable
 
 Here's a list of the relevant capabilities of all the stable state stores:
 
-|   | Store | Cursor listing | Offset listing | Sorting | Number of Items per Page | Prefix Search | Comments |
-| --- | --- | --- | --- | --- | --- | --- | --- |
-| 1 | aws dynamodb | Yes | No | Yes, with a GSI | Yes | Yes, with an additional sortKey and a GSI | In order to be able to use prefix search, users will need to have a Global Search Index(GSI) where the partition key will be a single fixed string (for ex. the `_` character) and the sort key will be the key name. There are some drawbacks to this that can be discussed in detail elsewhere. |
-| 2 | azure blob store | Yes (continuation token) | No | Always sorted in ASC order. Desc, or unsorted is not possible. | Yes | Yes | Results are always sorted by key name in ascending order. |
-| 3 | azure cosmos db | Yes | Yes | Yes | Yes | Yes |   |
-| 4 | azure table storage | Yes | No | Yes, just ASC | Yes, with $top | Yes, with range search | Partition key is the application id. |
-| 5 | cassandraYes | No | No | No | No | Can’t prefix search and sort across all partitions. We could consider maintaining a new table containing all keys, and mirroring the original key’s ttl. |   |
-| 6 | cockroachdbYes, if sorting is required | Yes | Yes | No | Yes | Need to create an index on the search column |   |
-| 7 | gcp firestore | Yes |   |   |   |   |   |
-| 8 | in-memory | No | No | No | No | No | We can implement all the features, but it’s not trivial to aggregate data across multiple instances |
-| 9 | memcached | No | No | No | No | No |   |
-| 10 | mongodbYes | Yes | Yes |   | Yes |   |   |
-| 11 | mysqlYes | Yes |   |   | Yes | Need to create an index on the id column. MySql supports specialized prefix indices, but you would have to know the exact length of the prefix you’ll be searching on, also sorting will not use the index. |   |
-| 12 | postgresqlYes | Yes | Yes | No | Yes | Need to create an index on the key column. We can use the varchar\_pattern\_ops operator class, optimised for prefix search. |   |
-| 13 | redisYes | No | No | No | Yes | Number of record per page is not guaranteed, but best effort. |   |
-| We could maintain our own sortedset that keeps the ttl. |   |   |   |   |   |   |   |
-| 14 | sqliteYes, if sorting is required | Yes | Yes | No | Yes | Need to create an index on the key column. It’s a standard b-tree index.We could maintain an index of all keys in a hash |   |
-| 15 | sqlserver | Yes, if sorting is required | Yes | Yes | Yes | Yes | need to create a non-clustered index on the “key” column |
+| Store | Cursor listing | Offset listing | Sorting | Number of Items per Page | Prefix Search | Comments |
+| --- | --- | --- | --- | --- | --- | --- |
+| aws dynamodb | Yes | No | Yes, with a GSI | Yes | Yes, with an additional sortKey and a GSI | In order to be able to use prefix search, users will need to have a Global Search Index(GSI) where the partition key will be a single fixed string (for ex. the `_` character) and the sort key will be the key name. There are some drawbacks to this that can be discussed in detail elsewhere. |
+| azure blob store | Yes (continuation token) | No | Always sorted in ASC order. Desc, or unsorted is not possible. | Yes | Yes | Results are always sorted by key name in ascending order. |
+| azure cosmos db | Yes | Yes | Yes | Yes | Yes |   |
+| azure table storage | Yes | No | Yes, just ASC | Yes, with $top | Yes, with range search | Partition key is the application id. |
+| cassandraYes | No | No | No | No | Can’t prefix search and sort across all partitions. We could consider maintaining a new table containing all keys, and mirroring the original key’s ttl. |   |
+| cockroachdbYes, if sorting is required | Yes | Yes | No | Yes | Need to create an index on the search column |   |
+| gcp firestore | Yes |   |   |   |   |   |
+| in-memory | No | No | No | No | No | We can implement all the features, but it’s not trivial to aggregate data across multiple instances |
+| memcached | No | No | No | No | No |   |
+| mongodbYes | Yes | Yes |   | Yes |   |   |
+| mysqlYes | Yes |   |   | Yes | Need to create an index on the id column. MySql supports specialized prefix indices, but you would have to know the exact length of the prefix you’ll be searching on, also sorting will not use the index. |   |
+| postgresqlYes | Yes | Yes | No | Yes | Need to create an index on the key column. We can use the varchar\_pattern\_ops operator class, optimised for prefix search. |   |
+| redisYes | No | No | No | Yes | Number of record per page is not guaranteed, but best effort. |   |
+|   |   |   |   |   |   |   |
+| sqliteYes, if sorting is required | Yes | Yes | No | Yes | Need to create an index on the key column. It’s a standard b-tree index.We could maintain an index of all keys in a hash |   |
+| sqlserver | Yes, if sorting is required | Yes | Yes | Yes | Yes | need to create a non-clustered index on the “key” column |

From ab04f50e996742e8bdd2843ce70888fe48d6cd74 Mon Sep 17 00:00:00 2001
From: Elena Kolevska
Date: Tue, 24 Sep 2024 15:12:23 +0100
Subject: [PATCH 7/9] fixes table

Signed-off-by: Elena Kolevska
---
 20240627-BC-listapi.md | 30 +++++++++++++++---------------
 1 file changed, 15 insertions(+), 15 deletions(-)

diff --git a/20240627-BC-listapi.md b/20240627-BC-listapi.md
index 54a6b87..0f41f62 100644
--- a/20240627-BC-listapi.md
+++ b/20240627-BC-listapi.md
@@ -164,19 +164,19 @@ Here's a list of the relevant capabilities of all the stable state stores:
 
 | Store | Cursor listing | Offset listing | Sorting | Number of Items per Page | Prefix Search | Comments |
 | --- | --- | --- | --- | --- | --- | --- |
-| aws dynamodb | Yes | No | Yes, with a GSI | Yes | Yes, with an additional sortKey and a GSI | In order to be able to use prefix search, users will need to have a Global Search Index(GSI) where the partition key will be a single fixed string (for ex. the `_` character) and the sort key will be the key name. There are some drawbacks to this that can be discussed in detail elsewhere. |
-| azure blob store | Yes (continuation token) | No | Always sorted in ASC order. Desc, or unsorted is not possible. | Yes | Yes | Results are always sorted by key name in ascending order. |
-| azure cosmos db | Yes | Yes | Yes | Yes | Yes |   |
-| azure table storage | Yes | No | Yes, just ASC | Yes, with $top | Yes, with range search | Partition key is the application id. |
-| cassandraYes | No | No | No | No | Can’t prefix search and sort across all partitions. We could consider maintaining a new table containing all keys, and mirroring the original key’s ttl. |   |
-| cockroachdbYes, if sorting is required | Yes | Yes | No | Yes | Need to create an index on the search column |   |
-| gcp firestore | Yes |   |   |   |   |   |
-| in-memory | No | No | No | No | No | We can implement all the features, but it’s not trivial to aggregate data across multiple instances |
-| memcached | No | No | No | No | No |   |
-| mongodbYes | Yes | Yes |   | Yes |   |   |
-| mysqlYes | Yes |   |   | Yes | Need to create an index on the id column. MySql supports specialized prefix indices, but you would have to know the exact length of the prefix you’ll be searching on, also sorting will not use the index. |   |
-| postgresqlYes | Yes | Yes | No | Yes | Need to create an index on the key column. We can use the varchar\_pattern\_ops operator class, optimised for prefix search. |   |
-| redisYes | No | No | No | Yes | Number of record per page is not guaranteed, but best effort. |   |
+| **aws dynamodb** | Yes | No | Yes, with a GSI | Yes | Yes, with an additional sortKey and a GSI | In order to be able to use prefix search, users will need to have a Global Search Index(GSI) where the partition key will be a single fixed string (for ex. the `_` character) and the sort key will be the key name. There are some drawbacks to this that can be discussed in detail elsewhere. |
+| **azure blob store** | Yes (continuation token) | No | Always sorted in ASC order. Desc, or unsorted is not possible. | Yes | Yes | Results are always sorted by key name in ascending order. |
+| **azure cosmos db** | Yes | Yes | Yes | Yes | Yes |   |
+| **azure table storage** | Yes | No | Yes, just ASC | Yes, with $top | Yes, with range search | Partition key is the application id. |
+| **cassandra** | Yes | No | No | Yes | No | Can’t prefix search and sort across all partitions. We could consider maintaining a new table containing all keys, and mirroring the original key’s ttl. |
+| **cockroachdb** | Yes, if sorting is required | Yes | Yes | Yes | Yes | Need to create an index on the search column |
+| **gcp firestore** | Yes |   |   |   |   |   |
+| **in-memory** | No | No | No | No | No | We can implement all the features, but it’s not trivial to aggregate data across multiple instances |
+| **memcached** | No | No | No | No | No |   |
+| **mongodb** | Yes | Yes | Yes | Yes | Yes |   |
+| **mysql** | Yes | Yes |   | Yes | Yes | Need to create an index on the id column. MySql supports specialized prefix indices, but you would have to know the exact length of the prefix you’ll be searching on, also sorting will not use the index. |
+| **postgresql** | Yes | Yes | Yes | Yes | Yes | Need to create an index on the key column. We can use the varchar\_pattern\_ops operator class, optimised for prefix search. |
+| **redis** | Yes | No | No | Yes (Best effort) | Yes | Number of record per page is not guaranteed, but best effort. |
 |   |   |   |   |   |   |   |
-| sqliteYes, if sorting is required | Yes | Yes | No | Yes | Need to create an index on the key column. It’s a standard b-tree index.We could maintain an index of all keys in a hash |   |
-| sqlserver | Yes, if sorting is required | Yes | Yes | Yes | Yes | need to create a non-clustered index on the “key” column |
+| **sqlite** | Yes, if sorting is required | Yes | Yes | Yes | Yes | Need to create an index on the key column. It’s a standard b-tree index.We could maintain an index of all keys in a hash |
+| **sqlserver** | Yes, if sorting is required | Yes | Yes | Yes | Yes | need to create a non-clustered index on the “key” column |

From 648440d43360eb4e4181f6602be68aad8b9b96e5 Mon Sep 17 00:00:00 2001
From: Elena Kolevska
Date: Tue, 8 Oct 2024 11:22:07 +0100
Subject: [PATCH 8/9] Typos

Signed-off-by: Elena Kolevska
---
 20240627-BC-listapi.md | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/20240627-BC-listapi.md b/20240627-BC-listapi.md
index 0f41f62..a13961a 100644
--- a/20240627-BC-listapi.md
+++ b/20240627-BC-listapi.md
@@ -174,9 +174,8 @@ Here's a list of the relevant capabilities of all the stable state stores:
 | **in-memory** | No | No | No | No | No | We can implement all the features, but it’s not trivial to aggregate data across multiple instances |
 | **memcached** | No | No | No | No | No |   |
 | **mongodb** | Yes | Yes | Yes | Yes | Yes |   |
-| **mysql** | Yes | Yes |   | Yes | Yes | Need to create an index on the id column. MySql supports specialized prefix indices, but you would have to know the exact length of the prefix you’ll be searching on, also sorting will not use the index. |
+| **mysql** | Yes | Yes | Yes | Yes | Yes | Need to create an index on the id column. MySql supports specialized prefix indices, but you would have to know the exact length of the prefix you’ll be searching on, also sorting will not use the index. |
 | **postgresql** | Yes | Yes | Yes | Yes | Yes | Need to create an index on the key column. We can use the varchar\_pattern\_ops operator class, optimised for prefix search. |
 | **redis** | Yes | No | No | Yes (Best effort) | Yes | Number of record per page is not guaranteed, but best effort. |
-|   |   |   |   |   |   |   |
 | **sqlite** | Yes, if sorting is required | Yes | Yes | Yes | Yes | Need to create an index on the key column. It’s a standard b-tree index.We could maintain an index of all keys in a hash |
 | **sqlserver** | Yes, if sorting is required | Yes | Yes | Yes | Yes | need to create a non-clustered index on the “key” column |

From ce3199a829d815aadc5f3f105c20d6d1da9ef6ac Mon Sep 17 00:00:00 2001
From: Elena Kolevska
Date: Wed, 9 Oct 2024 12:11:06 +0100
Subject: [PATCH 9/9] Removes sorting

Signed-off-by: Elena Kolevska
---
 20240627-BC-listapi.md | 44 ++++++++++++------------------------------
 1 file changed, 12 insertions(+), 32 deletions(-)

diff --git a/20240627-BC-listapi.md b/20240627-BC-listapi.md
index a13961a..3c0bc54 100644
--- a/20240627-BC-listapi.md
+++ b/20240627-BC-listapi.md
@@ -6,9 +6,12 @@ The requirements for the API are:
 
 - Ability to list all keys in a state store
 - Ability to list keys in a state store with a certain prefix
-- The results can be sorted
 - The results can be paginated
 
+Not required:
+- Ability to sort keys
+- Ability to return a paginated result of a snapshot of the state store at a certain point in time
+
 As with the other state store APIs, the List API will also have the difficult job of finding a set of features that are supported across most state store components and filling in the gaps with reasonable behaviour when they aren’t.
 
 ## API
@@ -18,14 +21,9 @@ As with the other state store APIs, the List API will also have the difficult jo
 Developers can list keys by issuing an HTTP API call to the Dapr sidecar:
 
 ```bash
-GET /v1.0/state/:storeName/?prefix={prefix}&sorting={sorting}&page_limit={pageLimit}&page_token={pageToken}
+GET /v1.0/state/:storeName/?prefix={prefix}&page_limit={pageLimit}&page_token={pageToken}
 ```
 
-The `sorting` query parameter can accept one of the following values:
-- `default`
-- `asc`
-- `desc`
-
 
 The response will be a JSON object with the following structure:
 ```json
@@ -38,7 +36,7 @@ The response will be a JSON object with the following structure:
 For example:
 Request:
 ```cURL
-GET /v1.0/state/myStateStore?prefix=user&sorting=asc&page_limit=3&page_token=user3
+GET /v1.0/state/myStateStore?prefix=user&page_limit=3&page_token=user3
 ```
 Response:
 ```json
@@ -66,19 +64,10 @@ message ListStateRequest {
   // The maximum number of items that should be returned per page
   optional uint32 page_limit = 2;
 
-  // Specifies if the result should be sorted
-  optional Sort sort = 3;
-
   // Specifies the next pagination token
   // If this is empty, it indicates the start of a new listing.
   optional string page_token = 4;
 
-  // Sorting order options
-  enum Sort {
-    DEFAULT = 0;
-    ASCENDING = 1;
-    DESCENDING = 2;
-  }
 }
 
 message ListStateResponse {
@@ -94,7 +83,6 @@ message ListStateResponse {
 ### Default values
 
 - Prefix: “”
-- Sorting: “default”
 - Page limit: 50
 - Next token: “”
 
@@ -121,18 +109,10 @@ SELECT * FROM items WHERE key > last_key_id ORDER BY key;
 
 ---
 
-Most often, offset-based pagination is not possible in no-sql databases, while it’s easy (even preferable) to implement in relational databases, so this proposal suggests using **token-based pagination** in the List API.
+Most often, offset-based pagination is not possible in no-sql databases, while token-based pagination is easy (even preferable) to implement in relational databases, so this proposal suggests using **token-based pagination** in the List API.
 
 Based on this decision, listing items will only be available forwards, and not backwards. To list previous pages, the application would have to keep track of the page tokens.
 
-## **Sorting**
-
-Sorting is required for token-based pagination in relational databases, so we must have a default sorting order.
-
-Some no-sql databases (ex. Azure blob store) don’t support sorting in descending order and others don’t support any sorting at all (ex. Redis). In these cases, we want to return an explicit error instead of failing silently.
-
-This might be restricting for use cases where the underlying state store needs to be swapped though. For example a team could use Redis for local development, and Postgres in production, and they wouldn’t be able to use the same application code, because the sorting clause would error on Redis, but pass on Postgres. That’s why we’re introducing the `Default` sorting option which will sort in ascending order for all databases that support it, and leave results unsorted for the databases that don’t.
-
 ## SDKs
 
 All supported SDKs should be updated to implement the List API. SDKs should offer the option to fetch batch values of the returned keys.
@@ -158,24 +138,24 @@ For the databases where this is a concern, we should offer an option to disable
 - **Prefix Search**: The ability to search for items that start with a given prefix.
 - **Pagination**: The ability to paginate through items, typically using skip/limit or similar mechanisms.
 
-## Adendum
+## Addendum
 
 Here's a list of the relevant capabilities of all the stable state stores:
 
 | Store | Cursor listing | Offset listing | Sorting | Number of Items per Page | Prefix Search | Comments |
 | --- | --- | --- | --- | --- | --- | --- |
 | **aws dynamodb** | Yes | No | Yes, with a GSI | Yes | Yes, with an additional sortKey and a GSI | In order to be able to use prefix search, users will need to have a Global Search Index(GSI) where the partition key will be a single fixed string (for ex. the `_` character) and the sort key will be the key name. There are some drawbacks to this that can be discussed in detail elsewhere. |
-| **azure blob store** | Yes (continuation token) | No | Always sorted in ASC order. Desc, or unsorted is not possible. | Yes | Yes | Results are always sorted by key name in ascending order. |
+| **azure blob store** | Yes | No | Always sorted in ASC order. Desc, or unsorted is not possible. | Yes | Yes | Results are always sorted by key name in ascending order. |
 | **azure cosmos db** | Yes | Yes | Yes | Yes | Yes |   |
 | **azure table storage** | Yes | No | Yes, just ASC | Yes, with $top | Yes, with range search | Partition key is the application id. |
-| **cassandra** | Yes | No | No | Yes | No | Can’t prefix search and sort across all partitions. We could consider maintaining a new table containing all keys, and mirroring the original key’s ttl. |
+| **cassandra** | Yes | No | No | Yes | Yes, with `ALLOW FILTERING` but it does a full table scan. | Prefix search with `ALLOW FILTERING` is prohibitively slow. Sorting doesn't happen across all partitions. We could consider maintaining a new index-like non-partitioned table containing all keys, and mirroring the original key’s ttl. |
 | **cockroachdb** | Yes, if sorting is required | Yes | Yes | Yes | Yes | Need to create an index on the search column |
 | **gcp firestore** | Yes |   |   |   |   |   |
-| **in-memory** | No | No | No | No | No | We can implement all the features, but it’s not trivial to aggregate data across multiple instances |
+| **in-memory** | No | No | No | No | No | All features can be implemented |
 | **memcached** | No | No | No | No | No |   |
 | **mongodb** | Yes | Yes | Yes | Yes | Yes |   |
 | **mysql** | Yes | Yes | Yes | Yes | Yes | Need to create an index on the id column. MySql supports specialized prefix indices, but you would have to know the exact length of the prefix you’ll be searching on, also sorting will not use the index. |
-| **postgresql** | Yes | Yes | Yes | Yes | Yes | Need to create an index on the key column. We can use the varchar\_pattern\_ops operator class, optimised for prefix search. |
+| **postgresql** | Yes, if sorting is required | Yes | Yes | Yes | Yes | Need to create an index on the key column. We can use the varchar\_pattern\_ops operator class, optimised for prefix search. |
 | **redis** | Yes | No | No | Yes (Best effort) | Yes | Number of record per page is not guaranteed, but best effort. |
 | **sqlite** | Yes, if sorting is required | Yes | Yes | Yes | Yes | Need to create an index on the key column. It’s a standard b-tree index.We could maintain an index of all keys in a hash |
 | **sqlserver** | Yes, if sorting is required | Yes | Yes | Yes | Yes | need to create a non-clustered index on the “key” column |
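
Illustration for readers of this series (not part of any patch): the token-based pagination contract the proposal settles on, with a prefix filter, `page_limit`, `page_token`, and an empty `next_page_token` meaning there are no more pages, can be sketched against a plain in-memory list of keys. This is a minimal sketch, not the Dapr implementation. The function names `list_keys` and `list_all` are hypothetical, and ordering the keys ascending is an assumption made here only so that the token stays stable between calls.

```python
from bisect import bisect_right
from typing import Iterable, List, Tuple


def list_keys(
    store_keys: Iterable[str],
    prefix: str = "",
    page_limit: int = 50,
    page_token: str = "",
) -> Tuple[List[str], str]:
    """Return one page of keys plus the token for the next page.

    Hypothetical helper: the token is simply the last key of the page,
    mirroring the HTTP example in the proposal.
    """
    # Assumption: a stable ascending order, so "everything after the token"
    # is well defined from one call to the next.
    ordered = sorted(k for k in store_keys if k.startswith(prefix))
    # Resume strictly after the last key returned by the previous page.
    start = bisect_right(ordered, page_token) if page_token else 0
    page = ordered[start:start + page_limit]
    has_more = start + page_limit < len(ordered)
    # An empty token signals that there are no more pages.
    next_token = page[-1] if (page and has_more) else ""
    return page, next_token


def list_all(store_keys: Iterable[str], prefix: str = "", page_limit: int = 50) -> List[str]:
    """Walk forward through every page, as an SDK convenience might."""
    keys = list(store_keys)
    collected: List[str] = []
    token = ""
    while True:
        page, token = list_keys(keys, prefix, page_limit, token)
        collected.extend(page)
        if not token:
            return collected


if __name__ == "__main__":
    keys = [f"user{i}" for i in range(1, 10)] + ["order1"]
    print(list_keys(keys, prefix="user", page_limit=3, page_token="user3"))
```

Because the token is derived from the last key of a page, listing only moves forwards, which matches the proposal's note that an application must remember earlier tokens if it wants to revisit previous pages.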