Commit 2c3b6cb

deserialize: move mod-level docs to new README.md
This commit moves the mod-level documentation from `scylla-cql/src/deserialize/mod.rs` to a new `README.md` file in the `scylla-cql/src/deserialize/` directory. This allows the `scylla` crate to use the same documentation without duplicating it.
1 parent 498384b commit 2c3b6cb

3 files changed

+214
-212
lines changed

Lines changed: 211 additions & 0 deletions
@@ -0,0 +1,211 @@
Framework for deserialization of data returned by database queries.

Deserialization is based on two traits:

- A type that implements `DeserializeValue<'frame, 'metadata>` can be deserialized
  from a single _CQL value_, i.e. an element of a row in the query result.
- A type that implements `DeserializeRow<'frame, 'metadata>` can be deserialized
  from a single _row_ of a query result.

Those traits are quite similar to each other, both in the idea behind them
and in the interface that they expose.

It's important to understand what a _deserialized type_ is. It's not just
an implementor of `Deserialize{Value, Row}`; some implementors of
`Deserialize{Value, Row}` are not final types yet, but **partially**
deserialized types that support further deserialization: _type
deserializers_, such as `ListlikeIterator`, `UdtIterator` or `ColumnIterator`.

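To illustrate what a _type deserializer_ is, here is a minimal, std-only sketch. `ByteListIter` and its one-byte length-prefix encoding are hypothetical; they are neither the driver's `ListlikeIterator` nor the CQL wire format. The point is only that the value stays serialized and elements are decoded lazily, each one borrowing from the frame.

```rust
/// Hypothetical "type deserializer": holds the still-serialized bytes of a
/// list and decodes one element per `next()` call.
struct ByteListIter<'frame> {
    rest: &'frame [u8],
}

impl<'frame> ByteListIter<'frame> {
    fn new(raw: &'frame [u8]) -> Self {
        Self { rest: raw }
    }
}

impl<'frame> Iterator for ByteListIter<'frame> {
    // Each element is a final, borrowed value bound only by `'frame`.
    type Item = &'frame [u8];

    fn next(&mut self) -> Option<Self::Item> {
        // Assumed toy encoding: one length byte, then that many payload bytes.
        let (&len, tail) = self.rest.split_first()?;
        let len = len as usize;
        if tail.len() < len {
            // Truncated input: stop iterating instead of panicking.
            self.rest = &[];
            return None;
        }
        let (item, rest) = tail.split_at(len);
        self.rest = rest;
        Some(item)
    }
}
```

Collecting the iterator yields the final values, while the iterator itself is only an intermediate, partially deserialized form.
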
# Lifetime parameters

- `'frame` is the lifetime of the frame. Any deserialized type that is going to borrow
  from the frame must have its lifetime bound by `'frame`.
- `'metadata` is the lifetime of the result metadata. Result metadata is only needed
  during the deserialization process itself, and the **final** deserialized types (i.e. those
  that are not going to deserialize anything else, as opposed to e.g. `MapIterator`) can
  later live independently of the metadata, so this lifetime is separate from `'frame`.

_Type deserializers_, as they still need to deserialize some type, are naturally bound
by the `'metadata` lifetime. However, final types are completely deserialized, so they should
not be bound by `'metadata`, only by `'frame`.

Rationale:
`DeserializeValue` requires two kinds of data in order to perform
deserialization:
1) a reference to the CQL frame (a `FrameSlice`),
2) the type of the column being deserialized, which is part of the
   `ResultMetadata`.

Similarly, `DeserializeRow` requires two kinds of data in order to
perform deserialization:
1) a reference to the CQL frame (a `FrameSlice`),
2) a slice of specifications of all columns in the row, which is part of
   the `ResultMetadata`.

When deserializing owned types, the frame and the metadata can have
any lifetime; it does not matter. Borrowed types,
however, borrow from the frame, so their lifetime must necessarily
be bound by the lifetime of the frame. Metadata is only needed for the
deserialization, so its lifetime does not inherently bound the
deserialized value. To avoid unnecessarily shortening the deserialized
values' lifetime to the metadata's lifetime (which would happen if the
metadata's and the frame's lifetimes were unified in value deserializers), a separate
lifetime parameter is introduced for result metadata: `'metadata`.

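The effect of keeping the two lifetimes separate can be sketched without the driver. In the std-only example below, `ColType`, `deserialize_text` and `text_outliving_metadata` are hypothetical stand-ins, not part of `scylla_cql`. The metadata is a local that is dropped before the deserialized value is used, which compiles only because the result is tied to `'frame` and not to `'metadata`.

```rust
/// Hypothetical stand-in for a column type taken from the result metadata.
#[derive(Debug, PartialEq)]
enum ColType {
    Text,
    Blob,
}

/// Borrows from the frame only: the returned `&str` carries `'frame`,
/// while the metadata reference has its own, unrelated lifetime.
fn deserialize_text<'frame, 'metadata>(
    typ: &'metadata ColType,
    frame: &'frame [u8],
) -> Option<&'frame str> {
    match typ {
        ColType::Text => std::str::from_utf8(frame).ok(),
        ColType::Blob => None,
    }
}

/// The metadata lives only inside this function, yet the deserialized value
/// is returned to the caller. If the result were bound by `'metadata`,
/// the borrow checker would reject this.
fn text_outliving_metadata(frame: &[u8]) -> Option<&str> {
    let metadata = ColType::Text; // dropped at the end of this function
    deserialize_text(&metadata, frame)
}
```

With a single unified lifetime on both parameters, `text_outliving_metadata` would fail to compile, because the returned string slice would be considered a borrow of the local `metadata`.
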
# `type_check` and `deserialize`

The deserialization process is divided into two parts: type checking and
actual deserialization, represented by the `type_check` and `deserialize`
methods of `DeserializeValue`/`DeserializeRow`.

The `deserialize` method can assume that `type_check` was called before, so
it doesn't have to verify the type again. This can be a performance gain
when deserializing query results with multiple rows: as each row in a result
has the same type, it is only necessary to call `type_check` once for the
whole result and then `deserialize` for each row.

Note that `deserialize` is not an `unsafe` method: although you can be
sure that the driver will call `type_check` before `deserialize`, you
shouldn't do unsafe things based on this assumption.

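The "check once, deserialize many times" pattern can be sketched with plain std types. All names below (`ColType`, `TypeMismatch`, `type_check_i32`, `deserialize_i32`, `deserialize_column`) are hypothetical and only mirror the split described above. Note that the per-value function still validates its input length, so skipping the type check can produce a wrong or missing value but never undefined behaviour.

```rust
/// Hypothetical column type, standing in for the real metadata.
#[derive(Debug, Clone, Copy, PartialEq)]
enum ColType {
    Int,
    Text,
}

#[derive(Debug, PartialEq)]
struct TypeMismatch;

/// Verifies that the column type matches what the target type expects.
fn type_check_i32(typ: ColType) -> Result<(), TypeMismatch> {
    if typ == ColType::Int {
        Ok(())
    } else {
        Err(TypeMismatch)
    }
}

/// Decodes a big-endian `i32`. It does not re-check the column type, but it
/// stays safe on its own by validating the length of the raw value.
fn deserialize_i32(raw: &[u8]) -> Option<i32> {
    if raw.len() != 4 {
        return None;
    }
    Some(i32::from_be_bytes([raw[0], raw[1], raw[2], raw[3]]))
}

/// Type-checks the column once, then deserializes the value of every row.
fn deserialize_column(
    typ: ColType,
    rows: &[&[u8]],
) -> Result<Vec<Option<i32>>, TypeMismatch> {
    type_check_i32(typ)?; // performed once for the whole result
    Ok(rows.iter().map(|raw| deserialize_i32(raw)).collect())
}
```
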
# Data ownership

Some CQL types can be easily consumed while still partially serialized.
For example, types like `blob` or `text` can be represented simply as
`&[u8]` and `&str` that point to a part of the serialized response.
This is more efficient than using `Vec<u8>` or `String` because it avoids
an allocation and a copy; however, it is less convenient because those types
are bound by a lifetime.

The framework supports types that refer to the serialized response's memory
in three different ways:

## Owned types

Some types don't borrow anything and fully own their data, e.g. `i32` or
`String`. They aren't constrained by any lifetime and should implement
the respective trait for _all_ lifetimes, i.e.:

```rust
# use scylla_cql::frame::response::result::{NativeType, ColumnType};
# use scylla_cql::deserialize::{DeserializationError, FrameSlice, TypeCheckError};
# use scylla_cql::deserialize::value::DeserializeValue;
use thiserror::Error;
// Owns its data, so it is not tied to `'frame` or `'metadata`.
struct MyVec(Vec<u8>);
#[derive(Debug, Error)]
enum MyDeserError {
    #[error("Expected bytes")]
    ExpectedBytes,
    #[error("Expected non-null")]
    ExpectedNonNull,
}
impl<'frame, 'metadata> DeserializeValue<'frame, 'metadata> for MyVec {
    fn type_check(typ: &ColumnType) -> Result<(), TypeCheckError> {
        // Only the `blob` CQL type is accepted.
        if let ColumnType::Native(NativeType::Blob) = typ {
            return Ok(());
        }
        Err(TypeCheckError::new(MyDeserError::ExpectedBytes))
    }

    fn deserialize(
        _typ: &'metadata ColumnType<'metadata>,
        v: Option<FrameSlice<'frame>>,
    ) -> Result<Self, DeserializationError> {
        // `None` means a null value; otherwise copy the bytes out of the frame.
        v.ok_or_else(|| DeserializationError::new(MyDeserError::ExpectedNonNull))
            .map(|v| Self(v.as_slice().to_vec()))
    }
}
```

## Borrowing types

Some types do not fully contain their data but rather point to some
bytes in the serialized response, e.g. `&str` or `&[u8]`. Those types
usually have a lifetime parameter in their definition. In order to properly
implement `DeserializeValue` or `DeserializeRow` for such a type, the `impl`
should still have generic lifetime parameters, but the lifetimes from the
type definition should be constrained by the generic `'frame` parameter
(e.g. `'frame: 'a`).
For example:

```rust
# use scylla_cql::frame::response::result::{NativeType, ColumnType};
# use scylla_cql::deserialize::{DeserializationError, FrameSlice, TypeCheckError};
# use scylla_cql::deserialize::value::DeserializeValue;
use thiserror::Error;
// Borrows its bytes from the frame for the lifetime `'a`.
struct MySlice<'a>(&'a [u8]);
#[derive(Debug, Error)]
enum MyDeserError {
    #[error("Expected bytes")]
    ExpectedBytes,
    #[error("Expected non-null")]
    ExpectedNonNull,
}
impl<'a, 'frame, 'metadata> DeserializeValue<'frame, 'metadata> for MySlice<'a>
where
    // The frame must outlive the borrowed slice.
    'frame: 'a,
{
    fn type_check(typ: &ColumnType) -> Result<(), TypeCheckError> {
        // Only the `blob` CQL type is accepted.
        if let ColumnType::Native(NativeType::Blob) = typ {
            return Ok(());
        }
        Err(TypeCheckError::new(MyDeserError::ExpectedBytes))
    }

    fn deserialize(
        _typ: &'metadata ColumnType<'metadata>,
        v: Option<FrameSlice<'frame>>,
    ) -> Result<Self, DeserializationError> {
        // No copy: the result points directly into the serialized frame.
        v.ok_or_else(|| DeserializationError::new(MyDeserError::ExpectedNonNull))
            .map(|v| Self(v.as_slice()))
    }
}
```

## Reference-counted types

Internally, the driver uses the `bytes::Bytes` type to keep the contents
of the serialized response. It supports creating derived `Bytes` objects
which point to a subslice but keep the whole, original `Bytes` object alive.

During deserialization, a type can obtain a `Bytes` subslice that points
to the serialized value. This approach combines the advantages of the previous
two approaches: creating a derived `Bytes` object can be cheaper than
an allocation and a copy (it has `Arc`-like semantics), and the `Bytes`
type is not constrained by a lifetime. However, you should be aware that
the subslice will keep the whole `Bytes` object that holds the frame alive.
This approach is not recommended for long-lived objects because
it can introduce space leaks.

Example:

```rust
# use scylla_cql::frame::response::result::{NativeType, ColumnType};
# use scylla_cql::deserialize::{DeserializationError, FrameSlice, TypeCheckError};
# use scylla_cql::deserialize::value::DeserializeValue;
# use bytes::Bytes;
use thiserror::Error;
// Holds a reference-counted handle to the frame, so no lifetime is needed.
struct MyBytes(Bytes);
#[derive(Debug, Error)]
enum MyDeserError {
    #[error("Expected bytes")]
    ExpectedBytes,
    #[error("Expected non-null")]
    ExpectedNonNull,
}
impl<'frame, 'metadata> DeserializeValue<'frame, 'metadata> for MyBytes {
    fn type_check(typ: &ColumnType) -> Result<(), TypeCheckError> {
        // Only the `blob` CQL type is accepted.
        if let ColumnType::Native(NativeType::Blob) = typ {
            return Ok(());
        }
        Err(TypeCheckError::new(MyDeserError::ExpectedBytes))
    }

    fn deserialize(
        _typ: &'metadata ColumnType<'metadata>,
        v: Option<FrameSlice<'frame>>,
    ) -> Result<Self, DeserializationError> {
        // The derived `Bytes` keeps the whole frame alive.
        v.ok_or_else(|| DeserializationError::new(MyDeserError::ExpectedNonNull))
            .map(|v| Self(v.to_bytes()))
    }
}
```
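
The space-leak caveat can be made concrete with a std-only analogy. `SharedSlice` and `subslice` below are hypothetical and are not how `bytes::Bytes` is implemented; they only model a handle that shares ownership of the whole buffer while exposing a small range of it.

```rust
use std::ops::Range;
use std::sync::Arc;

/// Hypothetical analogue of a derived `Bytes`: a shared handle to the whole
/// buffer plus the range that this value actually exposes.
#[derive(Clone)]
struct SharedSlice {
    buf: Arc<[u8]>,
    range: Range<usize>,
}

impl SharedSlice {
    /// The bytes this value logically contains.
    fn as_slice(&self) -> &[u8] {
        &self.buf[self.range.clone()]
    }

    /// The number of bytes kept alive by this value, i.e. the whole buffer.
    fn retained_len(&self) -> usize {
        self.buf.len()
    }
}

/// Creates a value that views `range` of `frame` without copying.
/// Only the reference count is bumped; a later `as_slice` call panics
/// if `range` is out of bounds.
fn subslice(frame: &Arc<[u8]>, range: Range<usize>) -> SharedSlice {
    SharedSlice {
        buf: Arc::clone(frame),
        range,
    }
}
```

Even after the original handle to the frame is dropped, a 4-byte `SharedSlice` taken from a 1024-byte buffer still retains all 1024 bytes, which is why copying into an owned type can be preferable for long-lived values.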
