Description
One of my personal use cases for betterproto is to call SciPy functions from other languages, which do not have such nice math libraries. That is, I have a Python gRPC service which essentially receives large float arrays, does math with them, and sends large result arrays in return. Unfortunately, serializing and deserializing (numpy) arrays with betterproto does not seem to be as efficient as it could be.
When serializing a float array to the protobuf format, the serialized protobuf message happens to be exactly in the right byte format to be interpreted as a numpy array, as the following example shows:
from dataclasses import dataclass
from typing import List

import betterproto
import numpy as np


@dataclass
class Array(betterproto.Message):
    values: List[float] = betterproto.double_field(1)


proto_array = Array(values=[1.23, 2.34, 3.45, 4.56])
serialized_array = bytes(proto_array)

# Skip the 2-byte header (field tag + payload length) to get the raw doubles.
np_array = np.frombuffer(serialized_array[2:])
print(np_array)  # [1.23 2.34 3.45 4.56]
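The byte layout claimed above can be checked without betterproto or numpy. This is a stdlib-only sketch, not betterproto internals; the `0x0A` tag byte (field 1, wire type 2) and the single length byte are assumptions that only hold for a payload shorter than 128 bytes:

```python
import struct

# Build the same packed repeated-double message by hand: one tag byte,
# one varint length byte, then the raw little-endian IEEE-754 doubles.
values = [1.23, 2.34, 3.45, 4.56]
payload = struct.pack(f"<{len(values)}d", *values)
message = bytes([0x0A, len(payload)]) + payload

# Skipping the 2-byte header recovers the original values exactly,
# with no protobuf parsing at all.
decoded = list(struct.unpack(f"<{len(values)}d", message[2:]))
print(decoded)  # [1.23, 2.34, 3.45, 4.56]
```

For longer arrays the length prefix grows to a multi-byte varint, so the fixed `[2:]` offset in the example above is a simplification that happens to work for small messages.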
However, when deserializing the protobuf message with betterproto, the array is converted into a Python list right away. If I need a numpy array, I have no choice but to convert it back and forth (and likewise for serialization), which is computationally expensive.
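The difference between the two paths can be sketched with the stdlib `array` module standing in for numpy (all names here are illustrative, not betterproto API): reinterpreting the packed payload is one bulk copy, while going through a Python list touches every element twice.

```python
import struct
from array import array

values = [1.23, 2.34, 3.45, 4.56]
payload = struct.pack(f"<{len(values)}d", *values)

# Cheap: bulk reinterpretation of the packed bytes,
# analogous to np.frombuffer on the serialized payload.
bulk = array("d")
bulk.frombytes(payload)

# Expensive: element-wise round trip through a Python list,
# which is what parsing to a list and rebuilding an array forces.
as_list = list(bulk)
rebuilt = array("d", as_list)

assert bulk == rebuilt
```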
I have two ideas for solving that issue:
Idea 1: Instead of storing protobuf scalar arrays as Python lists, you could store their byte representation inside a slim wrapper which behaves like a list, but which can also be converted into a numpy array without effort:
import struct


class Float64Array:
    __data: bytes

    def __init__(self, data: bytes):
        self.__data = data

    def __len__(self):
        return len(self.__data) // 8

    def __getitem__(self, i):
        # Each element is an 8-byte little-endian IEEE-754 double.
        return struct.unpack_from("<d", self.__data, i * 8)[0]

    def to_numpy_array(self):
        import numpy as np
        return np.frombuffer(self.__data)
Idea 2: You could introduce an optional protoc compiler flag that lets the caller decide whether scalar arrays should be stored as Python lists or as numpy arrays.
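A minimal sketch of what such an opt-in could mean for the decoder, using the stdlib `array` module in place of numpy; `decode_doubles` and `use_array` are hypothetical names, not proposed betterproto API:

```python
import struct
from array import array


def decode_doubles(payload: bytes, use_array: bool = False):
    """Decode a packed repeated-double payload into the caller's container."""
    if use_array:
        out = array("d")  # contiguous buffer, zero-conversion to numpy
        out.frombytes(payload)
        return out
    # Default path: element-wise conversion into a Python list.
    count = len(payload) // 8
    return list(struct.unpack(f"<{count}d", payload))


payload = struct.pack("<4d", 1.23, 2.34, 3.45, 4.56)
assert decode_doubles(payload) == [1.23, 2.34, 3.45, 4.56]
assert list(decode_doubles(payload, use_array=True)) == [1.23, 2.34, 3.45, 4.56]
```

The flag would only change the container the generated code fills in; the wire format stays identical either way.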