* escape utilities; wip unit tests thereof
* unit tests for the escape/unescape utilities
* added more tests of doc-path manip
* distinct methods accept list of literals and escape str input. WIP int.tests todo
* distinct: enriched int. tests with full coverage of list key inputs
* path segments in utilities can be integers
* unescape of empty string returns empty list
* wip for maps-as-tuples: got the basics working
* basic support for tuples in table payloads (no actual test in filters and some updates)
* align un/escape utilities to latest conventions
* finalize distinct docstrings
* disabled auto-tuples for any 'filter' portion of any payload
* cleanup of map2tuple_paths for updateOne
* changesfile
* update docstring for collection find/one about include_similarity
* wip for serders+generalized
* serdes option to control auto-tuple behaviour, added specific tests to tuplification
* refactoring of map2tuple path definition + unit test thereof
* full onboarding of the map2tuple serdes option
* refactor map2tuple logic to use discriminator functions
* basic support for column indexes (with 'entries', no testing)
* adjust against 1906 as temp measure
* completed 'inert' work on listindexes, parking as is for now
* fix some unit tests
* full empty-options-hiding in table index definition as_dict + adjust tests
* createCollection, FARR ready with unit tests
* rename enum to CursorState
* cursors, clone method(s) retain the mapping
* preparation for farr prototype
* FindAndRerankCursor classes
* remove the 'alive' (property) method of all cursors
* collection, find_and_rerank method impl
* changesfile
* restore FARR signature to resemble find; rename FARR's sort type alias to HybridSortType
* RerankedResult for FARR cursors
* changesfile
* fix return types of find_and_rerank's
* removed CumulativeOperationException
* insertmany overhaul WIP: collectionInsertMany done
* insertmany overhaul WIP: collectionInsertMany+TableInsertMany done
* all batch-op exceptions behave as per 2.0 spec now
* str repr for bulk op exception classes
* a type alias for DataAPIWarningDescriptor
* wip on revise docstrings and tests
* done last remaining todos in docstrings + note on escaping utils in readme
* adjust tests to new bulk-exception logic
* adjust integration testing to new bulk-exception structure
* exception diagram picture and changesfile done
* fix bulk-exceptions integration tests for dml tests
* all integration tests adapted to new bulk-error structure
* adjust all tests to the exception rework
* map-to-tuple serdes option has three states (never, DataAPIMaps, always)
* farr method docstring (colls); adjust tests for the default lexical coming back
* added missing docstrings
* update import sample code: readme+test
* refactor cursor modules
* renamed (table) index type 'text-analysed' => 'text'
* classes for findRerProvs response + unit tests
* async_/find_reranking_providers method in database admins
* reranking header provider classes
* api options is reranking-api-key aware
* adapt unit tests to reranking_api_keys param
* adapt to API renaming reranking->rerank in createCollection
* collection lifecycle int.tests for FARR
* added RerankingAPIKeyHeaderProvider code example to docstring
* add unit test of rerank-api-key in commander headers
* rename CollectionRerankingOptions & RerankingServiceOptions ==> CollectionRerankOptions & RerankServiceOptions
* improve farr mock response for testing set-up
* protect cursors' dict inputs with deepcopy
* mock-based basic 'integration tests' for findAndRerank
* thorough testing of the bulk exceptions
* include_sort_vector in findAndRerank support
* integration tests for get_sort_vector in collections' FARR
* include_scores in collection's FARR
* First farr IT with actual API; protection against 'vector' score; un-mock farr responses
* remove outdated ref to cursor.distinct removed method from docstrings
* farr cursors, completed docstrings for methods and classes
* standard cursor test for sync collection FARR cursor
* async farr cursor test
* adjusted a test to a newly-effective error deduplication in insertMany
* some missing parts from README on new features
* changesfile
* Eric P's spotted a typo in authentication.py docstring example
* change header name for reranking API key; restore empty-farr-results test
* remove hybridLimits from FARR tests when not needed as it became optional for the API
* add collection support for DataAPIDate and DataAPIMap + tests
* changesfile
* test of zero-match FARR cursors and their get_sort_vector properties
* extend vectorize/FARR sort and hybrid_limits testing to objects
* introduced rerankQuery param for FARR (for byov querying)
* adapt farr novectorize tests to rerank_query parameter; temporary fix against issue #1949
* find_and_rerank, detailed code examples in docstring
* docstrings
* removed mocker of farr responses
* comments
* adapt testing to dev/prod/local provisional differences re: hybrid and farr
* Ustilaginales
* more protection against issue 1949
* readme
* add DataAPIVector to HybridSortType; adjust astra admin tests to get_database requiring region
* mark find_and_rerank as beta method
* testing of typing with FARR
* final changesfile
File: CHANGES (43 additions & 5 deletions)
@@ -1,8 +1,43 @@
1
-
main
2
-
====
3
-
Exceptions
4
-
- `DataAPIResponseException.error_descriptors` is a property (computed from detailed_error_descriptors)
1
+
v 2.0.0
2
+
==========
3
+
Collections admit `DataAPIMap` in writing and support `DataAPIDate` as well (the latter coming with the same timezone caveats as `datetime.date`)
4
+
(plus all of the v2 pre-releases below)
5
+
6
+
7
+
v 2.0.0rc2
8
+
==========
9
+
Cursors (class hierarchy revised to accommodate `find_and_rerank`, plus other changes):
10
+
- renamed the 'FindCursorState' enum to `CursorState`
11
+
- renamed the abstract ur-class 'FindCursor' => `AbstractCursor`
12
+
- `clone` method does not strip the mapping anymore, rather retains it
13
+
- removed the `alive` sugar property (use `cursor.state != CursorState.CLOSED`)
14
+
Support for reranker header-based authentication:
15
+
- new authentication classes `RerankingHeadersProvider`, `RerankingAPIKeyHeaderProvider`
16
+
- introduced `reranking_api_key` parameter for APIOptions, `{get|create}_{table|collection}` database methods, collection/table `with_options` and `to_[a]sync` methods
17
+
Support for "findRerankingProviders" API command in Database Admin classes:
18
+
- class hierarchy: `RerankingProviderParameter`, `RerankingProviderModel`, `RerankingProviderToken`, `RerankingProviderAuthentication`, `RerankingProvider`, `FindRerankingProvidersResult` to express the response
- findAndRerank cursors return the new `RerankedResult` construct by default (modulo custom mappings)
26
+
Maps for tables expressed as list of pairs (association lists):
27
+
- support for automatic handling of DataAPIMaps (+possibly dicts) in the proper table payload portions
28
+
- introduced serdes option `encode_maps_as_lists_in_tables` (defaults to "NEVER") to control this
29
+
Exceptions, major rework of `[Table|Collection]InsertManyException`, `CollectionUpdateManyException` and `CollectionDeleteManyException` ("bulk operations")
30
+
- All astrapy exceptions derive directly from `Exception` (and not `ValueError` anymore)
5
31
- better string representation of `DataAPIDetailedErrorDescriptor`
32
+
- `DataAPIDetailedErrorDescriptor` removed.
33
+
- `CumulativeOperationException` removed.
34
+
- The four `CollectionInsertManyException`, `CollectionUpdateManyException`, `CollectionDeleteManyException`, `TableInsertManyException` classes now inherit directly from `DataAPIResponseException`.
35
+
- New semantics and structure for `[Collection|Table]InsertManyException`: they have members `inserted_ids`(/`inserted_id_tuples`) and an `exceptions` list for the root cause(s)
36
+
- New semantics and structure for `Collection[Update|Delete]ManyException`: they have members `partial_result` and a single-exception `cause`. They are now raised consistently for API exceptions occurring during the respective methods.
- Package on [PyPI](https://pypi.org/project/astrapy/)
95
95
96
+
### Server-side embeddings
97
+
98
+
AstraPy works with the "vectorize" feature of the Data API. This means that one can define server-side computation for vector embeddings and use text strings in place of a document vector, both in writing and in reading.
99
+
The transformation of said text into an embedding is handled by the Data API, using a provider and model you specify.
100
+
101
+
```python
my_collection = database.create_collection(
    "my_vectorize_collection",
    definition=(
        CollectionDefinition.builder()
        .set_vector_service(
            provider="example_vendor",
            model_name="embedding_model_name",
            authentication={"providerKey": "<STORED_API_KEY_NAME>"},  # if needed
        )
        .build()
    )
)

my_collection.insert_one({"$vectorize": "text to make into embedding"})
```
See the [Data API reference](https://docs.datastax.com/en/astra-db-serverless/databases/embedding-generation.html)
121
+
for more on this topic.
122
+
123
+
### Hybrid search
124
+
125
+
AstraPy supports the "find and rerank" Data API command,
126
+
which performs a hybrid search by combining results from a lexical search
127
+
and a vector-based search in a single operation.
128
+
129
+
```python
r_results = my_collection.find_and_rerank(
    sort={"$hybrid": "query text"},
    limit=10,
    include_scores=True,
)

for r_result in r_results:
    print(r_result.document, r_result.scores)
```
139
+
140
+
The Data API must support the primitive (and one must not have
141
+
disabled the feature at collection-creation time).
142
+
143
+
See the Data API reference, and the docstring for the `find_and_rerank` method,
144
+
for more on this topic.
145
+
96
146
### Using Tables
97
147
98
148
The example above uses a _collection_, where schemaless "documents" can be stored and retrieved.
@@ -184,7 +234,17 @@ for result in cursor:
184
234
my_table.drop()
185
235
```
186
236
187
-
For more on Tables, consult the [Data API documentation about Tables](https://docs.datastax.com/en/astra-db-serverless/api-reference/tables.html).
237
+
For more on Tables, consult the [Data API documentation about Tables](https://docs.datastax.com/en/astra-db-serverless/api-reference/tables.html). Note that most features of Collections, with due modifications, hold for Tables as well (e.g. "vectorize", i.e. server-side embeddings).
238
+
239
+
#### Maps as association lists
240
+
241
+
In the Data API, table `map` columns whose keys are of a type other than text
242
+
have to be expressed as association lists,
243
+
i.e. nested lists of lists: `[[key1, value1], [key2, value2], ...]`.
244
+
245
+
AstraPy objects can be configured to always do so automatically, for a seamless
246
+
experience.
247
+
See the API Option `serdes_options.encode_maps_as_lists_in_tables` for details.
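
To picture the association-list shape, here is a minimal hand-written sketch. It is not AstraPy's own implementation: `to_association_list` is a hypothetical helper name and the `scores_by_round` column is made up for the example. It only shows what a map with integer keys looks like once turned into a list of `[key, value]` pairs:

```python
from typing import Any


def to_association_list(mapping: dict[Any, Any]) -> list[list[Any]]:
    """Return [[key, value], ...] pairs in the mapping's insertion order."""
    # Hypothetical helper for illustration only; not part of AstraPy.
    return [[key, value] for key, value in mapping.items()]


# A map keyed by integers: JSON object keys must be strings, so such a map
# cannot travel as a plain JSON object and is expressed as pairs instead.
scores_by_round = {1: 0.5, 2: 0.75}
row = {"id": "a1", "scores_by_round": to_association_list(scores_by_round)}
print(row)
```

With the serdes option mentioned above, this conversion would presumably not need to be written by hand; the sketch is only meant to clarify the payload shape.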
188
248
189
249
### Usage with HCD and other non-Astra installations
190
250
@@ -393,6 +453,25 @@ my_collection.update_one(
393
453
my_collection.insert_one({"_id": uuid8()})
394
454
```
395
455
456
+
### Escaping field names
457
+
458
+
Field names containing special characters (`.` and `&`) must be correctly escaped
459
+
in certain Data API commands. It is a responsibility of the user to ensure escaping
460
+
is done when needed; however, AstraPy offers utilities to escape sequences of "path
461
+
segments" and -- should it ever be needed -- unescape path-strings back into
462
+
literal segments:
463
+
464
+
```python
465
+
from astrapy.utils.document_paths import escape_field_names, unescape_field_path