What makes loading the NumPy array 5x slower than the Python list? #4732
Answered by alex · christianrickert asked this question in Questions
Replies: 2 comments 2 replies
- I don't see anything about `ndarray` in your snippet, so I'm a bit confused.
- The reason is that, to go from a NumPy array to a `Vec` of `f64`, a new Python object has to be allocated and then unboxed for each value. With a list, the Python float objects already exist.
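For reference, a rough sketch of the per-value work this implies, written against a recent PyO3 (Bound API). The function name and the explicit indexing loop are illustrative only; PyO3's built-in `Vec<Vec<f64>>` conversion does the equivalent through the sequence protocol:

```rust
use pyo3::prelude::*;

/// Roughly the work needed to turn an arbitrary Python 2-D sequence
/// (nested list or ndarray) into nested Vec<f64> values.
#[pyfunction]
fn sum_manual(data: &Bound<'_, PyAny>) -> PyResult<f64> {
    let mut total = 0.0;
    for i in 0..data.len()? {
        // For an ndarray this allocates a fresh 1-D row view object;
        // for a list of lists the row object already exists.
        let row = data.get_item(i)?;
        for j in 0..row.len()? {
            // For an ndarray row this allocates a new NumPy scalar object,
            // which is then unboxed into an f64. For a Python list the
            // float object already exists and is only unboxed.
            let item = row.get_item(j)?;
            total += item.extract::<f64>()?;
        }
    }
    Ok(total)
}
```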
…On Tue, Nov 26, 2024, 5:52 PM Christian Rickert wrote:

> Thank you @davidhewitt.
>
> > I don't see anything about ndarray in your snippet, so I'm a bit confused.
>
> ```python
> # load_data.py
> np.random.rand(10, 1024 * 1024 * 100)           # type() returns: <class 'numpy.ndarray'>
> np.random.rand(10, 1024 * 1024 * 100).tolist()  # type() returns: <class 'list'>
> ```
>
> I didn't want to copy/paste for a single function call, `tolist()`, but I understand that it is too easy to miss. Apologies!
>
> > In general anything with 2d arrays you will want to pass them around as numpy / ndarray arrays, not vecs-of-vecs.
>
> I would really like to continue using `Vec<Vec<f64>>` on the Rust side if at all possible. However, there must be a reason (an implementation detail such as repetitive type checks or iterative memory allocations) why the conversion to `Vec<Vec<f64>>` performs significantly worse for an ndarray than for a list, and that reason is what I'm currently missing. Quite frankly, it doesn't make sense to me.
Answer selected by christianrickert
- I've run into an unexpected performance bottleneck with `pyo3` and `numpy`: in short, loading a large 2-dimensional `ndarray` into a Rust function takes about five times longer than loading a nested Python list. It takes about 25 seconds if I pass the `ndarray` to the `return_vector` Rust function; in contrast, it takes only about 6 seconds to load the same data as a Python list.
However, converting the `ndarray` to a Python list comes at the cost of (mostly) making a copy in memory, even without returning the processed data. I did have a look at the simple example for rust-numpy, but it adds a significant level of verbosity to the Rust code. I'm glad that `pyo3` works out of the box with both `ndarray` and Python lists, even without any changes to the Rust code! But is there something I missed that could explain the difference in performance?
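For comparison, a minimal sketch of the rust-numpy route mentioned above, which borrows the `ndarray` buffer as a read-only view instead of converting it element by element; the function name is illustrative:

```rust
use numpy::PyReadonlyArray2;
use pyo3::prelude::*;

/// Accepts a 2-D float64 ndarray without copying: as_array() exposes an
/// ndarray::ArrayView2<f64> over the NumPy buffer, so no per-element
/// Python objects are created during argument conversion.
#[pyfunction]
fn sum_ndarray<'py>(data: PyReadonlyArray2<'py, f64>) -> f64 {
    data.as_array().sum()
}
```

The extra verbosity is largely confined to the signature; the caller still passes the `ndarray` unchanged, and the argument conversion no longer scales with the number of elements.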