Skip to content

Conversation

@h4l
Copy link
Owner

@h4l h4l commented Dec 24, 2024

This PR implements support for writing to KV databases, using KV Connect / datapath protocol's atomic_write operation. It implements support for both the raw operation in denokv.datapath.atomic_write(), and high level APIs on the Kv type:

  • Kv.set(), Kv.delete(), Kv.sum(), Kv.min(), Kv.max(), Kv.enqueue() and Kv.check().
  • These methods are available on Kv itself for one-off operations, and
    Kv.atomic() can chain these methods to group write operations to apply together in a transaction.

@h4l h4l force-pushed the kv-write branch 3 times, most recently from a19e7f6 to 684f188 Compare February 4, 2025 03:51
h4l added 26 commits February 4, 2025 03:53
The dev service can be used for ad-hoc dev tasks in a specific Python
version environment. e.g:

    docker compose run --rm dev-py39
Python versions before 3.11 don't allow TypedDict subclasses to also
inherit from Generic at runtime.
We'll use this instead of ellipsis to detect arguments that are not set
in a call. This is needed to distinguish overloads in some cases.
We could get by without a functioning mock of the DB, but I find it
valuable to implement it to make sure I'm not missing something about
the possible functionality. Plus it'll make testing the Kv API easier.
This is the low-level API to invoke Atomic Write requests without any
hand-holding.
It makes Enums use 'EnumName.FIELD' as the repr rather than the default
syntax with < ... >.
KvEntry, VersionStamp and KvU64 are now in denokv._kv_values. These
types are needed by other submodules that I'd like to separate from the
denokv.kv module.
PlannedWrite is the central type, it represents the collection of
mutation operations to be applied to the DB in a single atomic write.

We have a KvWriter protocol which PlannedWrite implements, it'll be used
by Kv so that we don't have a circular dependency between denokv.kv.Kv
and denokv._kv_writes.PlannedWrite.

In addition, we now have a ProtobufMessageRepresentation[MessageT] type
which can be used to generalise the types representing datapath protocol
types, like Sum, Max, Min, etc; to avoid coupling PlannedWrite to our
default implementations.
We need a slightly customised V8 serialization format encoder to handle
int and float consistently in write sum/min/max operations.
This allows making atomic_write datapath requests to make changes to the
database, by translating PlannedWrite (& related types) into AtomicWrite
protobuf messages.
We need to use it in both test_kv and test_datapath.
Now in denokv_testing so use it in test_kv.
It's now make_database_metadata() and a separate meta_endpoint() fn to
unpack it. We'll use make_database_metadata() to replace the similar and
overlapping mk_db_meta() function.
make_database_metadata() covers its use-case now.
We were mocking the snapshot_read() function to return results from
the mock db implementation, but now we access the mock db via the
unmocked datapath module's HTTP requests to an aiohttp mock HTTP server.
We now use a customised V8 decoder for the MockKvDb API that mirrors the
default encoder used by Kv (that encodes int as BigInt): it always
decodes int32 values as float rather than the v8serialize default of
decoding int32 as int and Double as float.

We don't rely on this being in place as we currently encode all int32
range values as Double (which are always decoded as float), but it seems
prudent to ensure that we don't mistakenly treat small ints as BigInt
instead of Number.
CommittedWrite satisfies is_ok() and ConflictedWrite satisfies is_err().
The ResponseUnsuccessful exception held details of the error message
from the db server, but was not showing it in the default string
representation; now it does.
DenoKvError was extending BaseException instead of Exception, which
meant that catch-all Exception handlers didn't catch it.
We now provide a custom assertion failure message for comparisons
involving two protobuf Message objects. The property values are diffed,
similar to how pytest diffs dicts, etc.
We'll add __notes__ to exceptions in older versions, but in practice
they won't be visible in 3.9's tracebacks.

Based on: h4l/v8serialize@dec9d82
h4l added 20 commits February 4, 2025 03:54
It disables setattr and delattr, like @DataClass(frozen=True)
mypy thought it was Any.
MockKvDb's atomic sum/min/max operations now allow int/float to be used
for float sum operations. Previously sum operations had to use Python
float values, because int values were used to distinguish bigint values.

Now that v8serialize uses JSBigInt for bigint and decodes int-safe
floats as int, it's necessary to allow int in float operations.

We now segregate bigint/float/u64 number types according to the datapath
protobuf encoding rather than Python type, which avoids clashes between
bigint and float using Python int.
The Mapping needs overloads as its params are invariant.
The PlannedWrite and Mutation types now have much better support and
APIs for performing atomic number operations, like sum, min and max.

The first main feature is that the 3 number types (bigint, u64, float) can
be selected explicitly with a number_type parameter, or implicitly via
the type of the number value, and both options are type-safe. The second
is that we support min/max mutations on bigint and float types, not just
u64 (KvU64), and we support sum on KvU64 with both wrapping and clamp
limits. The underlying datapath protocol (and Deno's own KV API) do not
support these. This makes the atomic number APIs more intuitive, as
otherwise the user needs to know about the arbitrary restriction that
only certain number types can use Min/Max/Sum.

I don't love the massive amount of @overload annotations it takes to
type-annotate these functions well, but I don't think there's a better
alternative.

The way that PlannedWrite interfaces with the main Kv class is now
decoupled with a general interface, so Kv and PlannedWrite do not have
explicit dependencies on each other. This makes it easier to test and
modularise the codebase, and also allows alternative writer
implementations to be created for special cases.
The set()/sum()/delete() etc methods of PlannedWrite are now implemented
in individual mixin classes. This will allow us to implement the same
set of methods on the Kv class, without duplicating all the horrific
@overload annotations.
It used to always require a message string, which made it awkward to
subclass for errors that don't want to define a message at init.
Rather than throwing the lower-level errors from datapath, we'll now use
one of these two higher-level errors.
Checks for unbound variable use.
The denokv self-hosted implementation does not return indexes of failed
checks, so we need to allow this. The spec does not say that servers
MUST report indexes of failed checks, only that clients SHOULD report
failed indexes. See: denoland/denokv#110
CommittedWrite, ConflictedWrite and FailedWrite now all have a
has_unknown_conflicts property to indicate whether the
conflicts/failed_checks is populated with the specific checks that
failed.

This is necessary because databases don't have to report which checks
failed.
Kv now has the check(), set(), sum(), min(), max(), delete() and
enqueue() methods of PlannedWrite, using the same mixins that define
these methods on PlannedWrite. On Kv they execute immediately instead of
accumulating operations to execute later.
Python 3.9's datetime.fromisoformat() does not parse datetimes with a
UTC Z indicator.
Some mixin/interface types did not have empty __slots__ defined, which
caused types using slots to still have dicts.
We now use ruff/flake8-tidy-imports to ban importing
typing/typing_extensions. Unfortunately it doesn't seem to support
banning everything except allow-listed items, so we have to ignore
places that need to import Literal and overload from typing (due to ruff
mis-handling them when they're imported from our typing module).
@h4l h4l merged commit 9e18b91 into main Feb 4, 2025
23 checks passed
@h4l h4l changed the title wip: implement write operations Implement write operations Feb 4, 2025
@h4l h4l deleted the kv-write branch February 4, 2025 06:03
@h4l h4l deployed to release February 4, 2025 06:04 — with GitHub Actions Active
@h4l h4l mentioned this pull request Feb 4, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants