18 changes: 18 additions & 0 deletions Lib/test/test_memoryview.py
@@ -90,6 +90,24 @@ def test_iter(self):
            m = self._view(b)
            self.assertEqual(list(m), [m[i] for i in range(len(m))])

    def test_contains(self):
        for tp in self._types:
            b = tp(self._source)
            m = self._view(b)
            for c in list(m):
                with self.subTest(self._source, buffer_type=tp, item=c):
                    self.assertIn(c, m)

            with self.subTest('empty buffer'):
                empty = tp(b'')
                mview = self._view(empty)
                self.assertNotIn(0, mview)

            with self.subTest('not found'):
                b = tp(b'abc')
                m = self._view(b)
                self.assertNotIn(ord('d'), m)

    def test_count(self):
        for tp in self._types:
            b = tp(self._source)
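The semantics these tests pin down can be tried interactively today; membership on a memoryview already works through the sequence protocol, so the following sketch runs on current Python as well. Items of a view over a bytes buffer are small integers, which is what the `in` operator compares against.

```python
# Items of a memoryview over a bytes buffer are small integers,
# so membership tests compare against those integers.
m = memoryview(b'abc')

print(ord('b') in m)           # True: item 98 is present
print(ord('d') in m)           # False: item 100 is absent
print(0 in memoryview(b''))    # False: an empty view contains nothing
```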
@@ -0,0 +1,2 @@
Implement :meth:`~object.__contains__` for :class:`memoryview` objects.
Patch by Bénédikt Tran.
38 changes: 34 additions & 4 deletions Objects/memoryobject.c
@@ -2483,6 +2483,37 @@ memory_item_multi(PyMemoryViewObject *self, PyObject *tup)
    return unpack_single(self, ptr, fmt);
}

/* Test the membership of an item. */
static int
memory_contains(PyObject *self, PyObject *value)
{
    PyObject *iter = PyObject_GetIter(self);
[Review thread on this line]

Contributor:
I think it might be good to follow the pattern of already implemented methods and make it a bit faster. Creating iterators when they aren't needed might not be the best option for a tensor-like object. E.g., I don't think numpy does `obj in array.flat` for `a in array`. I think it is much easier to do it from the beginning, compared to all the convincing that will be required to change this later.

Contributor:
I think the best reference for such decisions for memoryview is `numpy.array`. Although memoryview is still very primitive in comparison, it is possible this will change with time.

Member (author):
As I said on the issue, performance should be addressed in a follow-up PR. Note that this pattern is the pattern used by list objects.

Contributor:
> As I said on the issue, performance should be addressed in a follow-up PR.

I don't think this is only about performance, but also about design. Also, such a follow-up PR would pretty much replace the whole implementation of this PR.

> Note that this pattern is the pattern used by list objects.

What I am suggesting is that memoryview should be modelled after tensor-like objects, not CPython sequences such as list or tuple. It is a slightly different breed, IMO.

Member (author):
For now, memoryviews are only 1-dimensional =/ and we don't have multi-dimensional slicing. It's been like this for ages. Tensors are generally used for generalizing vectors and matrices, but for now we do not support them at all.

I can try to use lower-level calls, but that would overcomplicate the implementation itself. My original idea was to inline most of the calls to avoid materializing an iterator and advance item by item manually, but I expected that implementation to be harder to maintain. As Jelle said, `in` already worked because it delegated to `PySequence_Contains`. What we needed in this PR was an explicit `Sequence.__contains__` (same for `Sequence.__reversed__` in the other PR).

Member (author):
No, because the "natural evolution" hasn't changed for the past 10 years...

Member (author):
And I personally don't mind updating what I wrote later.

Contributor:
> No, because the "natural evolution" hasn't changed for the past 10 years...

It did not evolve much over the past 10 years, yes. From my POV, the past might not be a very good indicator of the future in this case. I think there are some overlooked opportunities here.

I am personally interested in multi-dimensional slicing of memoryview, and there is a reasonable chance that I will make some attempts in the reasonably near future. I have been thinking about it approximately since https://discuss.python.org/t/memoryview-multi-dimensional-slicing-support/52776.

> And I personally don't mind updating what I wrote later.

That doesn't make much difference; whether it is you or someone else, the biggest cost here is a new PR, the review process, time delays, etc. The implementation itself would be a minor issue.

Member (author):
> the past might not be a very good indicator of the future in this case.

Unfortunately, it is for most core devs. I will not change my stance on this matter for now, because my aim was to port the implicit dispatch (remember that `in` already works because it dispatches to `PySequence_Contains`) to an explicit one. If you want an implementation using the index-based approach, feel free to open an alternate PR.

Contributor:
It is not as big of a deal as I might seem to be making it. :) Given that multi-dimensional slicing happens, this will most likely be changed as part of it by whoever does that work. But I still find it hard to see why not implement it more correctly from the beginning, given such a low cost. :)

    if (iter == NULL) {
        return -1;
    }

    PyObject *item = NULL;
    while (PyIter_NextItem(iter, &item)) {
        if (item == NULL) {
            Py_DECREF(iter);
            return -1;
        }
        if (item == value) {
            Py_DECREF(item);
            Py_DECREF(iter);
            return 1;
        }
        int contained = PyObject_RichCompareBool(item, value, Py_EQ);
        Py_DECREF(item);
        if (contained != 0) {
            Py_DECREF(iter);
            return contained;
        }
    }
    Py_DECREF(iter);
    return 0;
}
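A Python-level model of the loop above (an illustration only, not the C implementation) makes the control flow easier to follow: identity short-circuits the comparison, `==` decides membership, and an error raised by the comparison simply propagates, matching the `-1` return paths in the C code.

```python
def contains(view, value):
    # Mirrors memory_contains: walk the view's items, short-circuit
    # on identity, then fall back to an equality comparison.
    for item in view:
        if item is value or item == value:
            return True
    return False

print(contains(memoryview(b'abc'), ord('b')))  # True
print(contains(memoryview(b''), 0))            # False
```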

static inline int
init_slice(Py_buffer *base, PyObject *key, int dim)
{
@@ -2741,10 +2772,9 @@ static PyMappingMethods memory_as_mapping = {

 /* As sequence */
 static PySequenceMethods memory_as_sequence = {
-    memory_length,            /* sq_length */
-    0,                        /* sq_concat */
-    0,                        /* sq_repeat */
-    memory_item,              /* sq_item */
+    .sq_length = memory_length,
+    .sq_item = memory_item,
+    .sq_contains = memory_contains
 };
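For comparison, the index-based approach advocated in the review thread could look roughly like this at the Python level (a hypothetical sketch; an actual C version would walk the buffer through length and item access rather than allocating an iterator object):

```python
def contains_indexed(view, value):
    # Hypothetical index-based membership test: addresses each
    # position directly instead of creating an iterator.
    for i in range(len(view)):
        item = view[i]
        if item is value or item == value:
            return True
    return False
```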

