Skip to content

Commit b0afc83

Browse files
RUST-2196 Require ignored bits in packed bit vector to be 0 (#586)
1 parent 5622ad4 commit b0afc83

File tree

4 files changed

+108
-13
lines changed

4 files changed

+108
-13
lines changed

src/binary/vector.rs

Lines changed: 17 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -87,18 +87,30 @@ impl PackedBitVector {
8787
/// ```
8888
///
8989
/// Padding must be within 0-7 inclusive. Padding must be 0 or unspecified if the provided
90-
/// vector is empty.
90+
/// vector is empty. The ignored bits in the vector must all be 0.
9191
pub fn new(vector: Vec<u8>, padding: impl Into<Option<u8>>) -> Result<Self> {
9292
let padding = padding.into().unwrap_or(0);
9393
if !(0..8).contains(&padding) {
9494
return Err(Error::binary(format!(
9595
"vector padding must be within 0-7 inclusive, got {padding}"
9696
)));
9797
}
98-
if padding != 0 && vector.is_empty() {
99-
return Err(Error::binary(format!(
100-
"cannot specify non-zero padding if the provided vector is empty, got {padding}",
101-
)));
98+
match vector.last() {
99+
Some(last) => {
100+
if last.trailing_zeros() < u32::from(padding) {
101+
return Err(Error::binary(
102+
"the ignored bits in a packed bit vector must all be 0",
103+
));
104+
}
105+
}
106+
None => {
107+
if padding != 0 {
108+
return Err(Error::binary(format!(
109+
"cannot specify non-zero padding if the provided vector is empty, got \
110+
{padding}"
111+
)));
112+
}
113+
}
102114
}
103115
Ok(Self { vector, padding })
104116
}

src/tests/spec/json/bson-binary-vector/README.md

Lines changed: 63 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -55,6 +55,69 @@ MUST assert that the input float array is the same after encoding and decoding.
5555
- if the canonical_bson field is present, raise an exception when attempting to deserialize it into the corresponding
5656
numeric values, as the field contains corrupted data.
5757

58+
## Prose Tests
59+
60+
### Treatment of non-zero ignored bits
61+
62+
All drivers MUST test encoding and decoding behavior according to their design and version. For drivers that haven't
63+
been completed, raise exceptions in both cases. For those that have, update to this behavior according to semantic
64+
versioning rules, and update tests accordingly.
65+
66+
In both cases, [255], a single byte PACKED_BIT vector of length 1 (hence padding of 7) provides a good example to use,
67+
as all of its bits are ones.
68+
69+
#### 1. Encoding
70+
71+
- Test encoding with non-zero ignored bits. Use the driver API that validates vector metadata.
72+
- If the driver validates ignored bits are zero (preferred), expect an error. Otherwise expect the ignored bits are
73+
preserved.
74+
75+
```python
76+
with pytest.raises(ValueError):
77+
Binary.from_vector([0b11111111], BinaryVectorDtype.PACKED_BIT, padding=7)
78+
```
79+
80+
### 2. Decoding
81+
82+
- Test the behaviour of your driver when one attempts to decode from binary to vector.
83+
- e.g. As of pymongo 4.14, a warning is raised. From 5.0, it will be an exception.
84+
85+
```python
86+
b = Binary(b'\x10\x07\xff', subtype=9)
87+
with pytest.warns():
88+
Binary.as_vector(b)
89+
```
90+
91+
Drivers MAY skip this test if they choose not to implement a `Vector` type.
92+
93+
### 3. Comparison
94+
95+
Once we can guarantee that all ignored bits are non-zero, then equality can be tested on the binary subtype. Until then,
96+
equality is ambiguous, and depends on whether one compares by bits (uint1), or uint8. Drivers SHOULD test equality
97+
behavior according to their design and version.
98+
99+
For example, in `pymongo < 5.0`, we define equality of a BinaryVector by matching padding, dtype, and integer. This
100+
means that two single bit vectors in which 7 bits are ignored do not match unless all bits match. This mirrors what the
101+
server does.
102+
103+
```python
104+
b1 = Binary(b'\x10\x07\x80', subtype=9) # 1-bit vector with all 0 ignored bits.
105+
b2 = Binary(b'\x10\x07\xff', subtype=9) # 1-bit vector with all 1 ignored bits.
106+
b3 = Binary.from_vector([0b10000000], BinaryVectorDtype.PACKED_BIT, padding=7) # Same data as b1.
107+
108+
v1 = Binary.as_vector(b1)
109+
v2 = Binary.as_vector(b2)
110+
v3 = Binary.as_vector(b3)
111+
112+
assert b1 != b2 # Unequal at naive Binary level
113+
assert v2 != v1 # Also chosen to be unequal at BinaryVector level as [255] != [128]
114+
assert b1 == b3 # Equal at naive Binary level
115+
assert v1 == v3 # Equal at the BinaryVector level
116+
```
117+
118+
Drivers MAY skip this test if they choose not to implement a `Vector` type, or the type does not support comparison, or
119+
the type cannot be constructed with non-zero ignored bits.
120+
58121
## FAQ
59122

60123
- What MongoDB Server version does this apply to?

src/tests/spec/json/bson-binary-vector/packed_bit.json

Lines changed: 8 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -21,22 +21,22 @@
2121
"canonical_bson": "1600000005766563746F7200040000000910007F0700"
2222
},
2323
{
24-
"description": "Empty Vector PACKED_BIT",
24+
"description": "PACKED_BIT with padding",
2525
"valid": true,
26-
"vector": [],
26+
"vector": [127, 8],
2727
"dtype_hex": "0x10",
2828
"dtype_alias": "PACKED_BIT",
29-
"padding": 0,
30-
"canonical_bson": "1400000005766563746F72000200000009100000"
29+
"padding": 3,
30+
"canonical_bson": "1600000005766563746F7200040000000910037F0800"
3131
},
3232
{
33-
"description": "PACKED_BIT with padding",
33+
"description": "Empty Vector PACKED_BIT",
3434
"valid": true,
35-
"vector": [127, 7],
35+
"vector": [],
3636
"dtype_hex": "0x10",
3737
"dtype_alias": "PACKED_BIT",
38-
"padding": 3,
39-
"canonical_bson": "1600000005766563746F7200040000000910037F0700"
38+
"padding": 0,
39+
"canonical_bson": "1400000005766563746F72000200000009100000"
4040
},
4141
{
4242
"description": "Overflow Vector PACKED_BIT",

src/tests/spec/vector.rs

Lines changed: 20 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -210,3 +210,23 @@ fn run_test_file(test_file: TestFile) {
210210
fn run_vector_tests() {
211211
run_spec_test(&["bson-binary-vector"], run_test_file);
212212
}
213+
214+
#[test]
215+
fn non_zero_ignored_bits() {
216+
// 1. Encoding
217+
let error = PackedBitVector::new(vec![u8::MAX], 1).unwrap_err();
218+
assert!(error
219+
.message
220+
.is_some_and(|message| message.contains("must all be 0")));
221+
222+
// 2. Decoding
223+
let bytes = vec![PACKED_BIT, 1, u8::MAX];
224+
let error = Vector::from_bytes(bytes).unwrap_err();
225+
assert!(error
226+
.message
227+
.is_some_and(|message| message.contains("must all be 0")));
228+
229+
// 3. Comparison
230+
// This test is not implemented because it is not possible to construct a Vector with non-zero
231+
// padding bits.
232+
}

0 commit comments

Comments
 (0)