__len__ documentation should state size restriction #5843

@asaites

Description

The documentation for the __len__ magic method says the method must return a usize, and PyO3 does enforce that restriction. However, the underlying CPython type is Py_ssize_t, further restricted to non-negative values, so on a 64-bit machine __len__ must return a value in $[0, 2^{63}-1]$. Although this is documented in Python's data model, I think it's worth adding a note to PyO3's docs, since the usize explanation (and the introductory notes) seem to imply that values in $[2^{63}, 2^{64}-1]$ are OK.
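As a quick illustration of the limit described above, here is a minimal pure-Python sketch. The class `VirtualSeq` and the helper `len_outcome` are hypothetical names made up for this example; the sketch assumes CPython, where `sys.maxsize` is the largest value a `Py_ssize_t` can hold ($2^{63}-1$ on a 64-bit build).

```python
import sys


class VirtualSeq:
    """Hypothetical sequence whose length is supplied at construction."""

    def __init__(self, length):
        self.length = length

    def __len__(self):
        return self.length


def len_outcome(length):
    """Return len() of a VirtualSeq, or the name of the exception it raised."""
    try:
        return len(VirtualSeq(length))
    except (OverflowError, ValueError) as exc:
        return type(exc).__name__


# The largest accepted length is sys.maxsize, not the maximum of an
# unsigned pointer-sized integer.
print(len_outcome(sys.maxsize) == sys.maxsize)
print(len_outcome(sys.maxsize + 1))  # rejected: does not fit in Py_ssize_t
print(len_outcome(-1))               # rejected: negative length
```

The upper half of the usize range is therefore never a valid result for len(), even though a Rust signature returning usize suggests it could be.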

This came up for me in a "virtual sequence" for which the results of __len__ and __getitem__ are calculated on-demand using values supplied when constructing the object.
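To make the "virtual sequence" idea concrete, here is a small pure-Python sketch, not the actual code from my project. `VirtualRange` and its parameters are hypothetical; the eager check against `sys.maxsize` in the constructor is just one possible way to surface the problem early with a descriptive message, and is an assumption of this sketch rather than anything PyO3 does.

```python
import sys


class VirtualRange:
    """Hypothetical virtual sequence: items are computed, never stored."""

    def __init__(self, start, stop, step=1):
        if step <= 0:
            raise ValueError("step must be positive in this sketch")
        # Number of items (ceiling division), derived from the arguments.
        length = max(0, -(-(stop - start) // step))
        # Fail at construction time instead of on the first len() call.
        if length > sys.maxsize:
            raise OverflowError(
                f"length {length} exceeds the maximum len() of {sys.maxsize}"
            )
        self.start = start
        self.step = step
        self._length = length

    def __len__(self):
        return self._length

    def __getitem__(self, index):
        # Support negative indices the way built-in sequences do.
        if index < 0:
            index += self._length
        if not 0 <= index < self._length:
            raise IndexError("index out of range")
        return self.start + index * self.step


seq = VirtualRange(0, 10, 3)
print(len(seq), seq[0], seq[-1])
```

Checking in the constructor keeps the failure next to the value that caused it, which is the part that was hard to track down in my case.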

The conversion code does check for out-of-bounds values and raises OverflowError in the cases that Python would reject, but the error carries no message, so its source is confusing to track down. In the example below, note that the equivalent Python error message has a bit more information about the cause, so perhaps it would be worth matching that message.

Alternatively, you could require that __len__ return isize and document that Python requires the value to be non-negative, but that would be a breaking change for a case that probably doesn't affect most users. The advantage there is that people are probably less likely to try returning a negative value, given the semantics of the method.

Here is some example code that demonstrates the differences in error messages. Overall, PyO3 seems to allow the same behavior that one could get in pure Python; the mismatch is really between the range of Rust's usize and the actual maximum value Python accepts from __len__.

Rust-based extension:

use pyo3::{prelude::*, types::PySequence};

#[pyclass(sequence)]
struct MyRustSeq {
    length: usize,
}

#[pymethods]
impl MyRustSeq {
    #[new]
    fn __new__(length: usize) -> Self {
        Self { length }
    }

    fn __len__(&self) -> usize {
        self.length
    }

    #[allow(unused_variables)]
    fn __getitem__(&self, index: Bound<'_, PyAny>) -> i32 {
        42
    }
}

#[pyclass(sequence)]
struct WrapperSeq {
    inner: Py<PySequence>,
}

#[pymethods]
impl WrapperSeq {
    #[new]
    fn __new__(inner: Py<PySequence>) -> Self {
        Self { inner }
    }

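    // Delegate to Python's built-in len() so that the wrapped object's own
    // error (OverflowError or ValueError) propagates unchanged.
    fn __len__<'py>(slf: PyRef<'_, Self>, py: Python<'py>) -> PyResult<usize> {
        py.import("builtins")?.getattr("len")?.call1((slf.inner.bind_borrowed(py),))?.extract()
    }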

    #[allow(unused_variables)]
    fn __getitem__(&self, index: Bound<'_, PyAny>) -> i32 {
        42
    }
}

#[pymodule]
pub(crate) fn my_module(m: &Bound<'_, PyModule>) -> PyResult<()> {
    m.add_class::<MyRustSeq>()?;
    m.add_class::<WrapperSeq>()?;
    Ok(())
}

Python code and test cases (using pytest):

from collections.abc import Sequence

import pytest

import my_module


class PySeq(Sequence):
    def __init__(self, length):
        self.length = length

    def __len__(self):
        return self.length

    def __getitem__(self, index):
        return 42


class TestRustSeq:
    def test_ok(self):
        s = my_module.MyRustSeq(pow(2, 63)-1)
        assert s[0] == s[pow(2, 128)] == s[-1] == 42
        assert len(s) == pow(2, 63)-1

    def test_too_big(self):
        s = my_module.MyRustSeq(pow(2, 63))
        assert s[0] == s[pow(2, 128)] == s[-1] == 42
        with pytest.raises(OverflowError, match=("^$")):
            assert len(s) == pow(2, 63)

    def test_neg_len(self):
        with pytest.raises(OverflowError, match=("^can't convert negative int to unsigned$")):
            _ = my_module.MyRustSeq(-1)


class TestPySeq:
    def test_ok(self):
        s = PySeq(pow(2, 63)-1)
        assert s[0] == s[pow(2, 128)] == s[-1] == 42
        assert len(s) == pow(2, 63)-1

    def test_too_big(self):
        s = PySeq(pow(2, 63))
        assert s[0] == s[pow(2, 128)] == s[-1] == 42
        with pytest.raises(OverflowError, match=("^cannot fit 'int' into an index-sized integer$")):
            assert len(s) == pow(2, 63)

    def test_neg_len(self):
        s = PySeq(-1)
        assert s[0] == s[pow(2, 128)] == s[-1] == 42
        with pytest.raises(ValueError, match=(r"^__len__\(\) should return >= 0$")):
            assert len(s) == -1


class TestWrappedSeq:
    def test_ok(self):
        s = my_module.WrapperSeq(PySeq(pow(2, 63)-1))
        assert s[0] == s[pow(2, 128)] == s[-1] == 42
        assert len(s) == pow(2, 63)-1

    def test_too_big(self):
        s = my_module.WrapperSeq(PySeq(pow(2, 63)))
        assert s[0] == s[pow(2, 128)] == s[-1] == 42
        with pytest.raises(OverflowError, match=("^cannot fit 'int' into an index-sized integer$")):
            assert len(s) == pow(2, 63)

    def test_neg_len(self):
        s = my_module.WrapperSeq(PySeq(-1))
        assert s[0] == s[pow(2, 128)] == s[-1] == 42
        with pytest.raises(ValueError, match=(r"^__len__\(\) should return >= 0$")):
            assert len(s) == -1
