Performance issue #3904
-
Hi everyone,

Thanks for the amazing work on pyo3! I'm currently adding Python bindings to one of my libraries, and I'm trying to understand a huge drop in performance (100x) between two versions of my code.

In short, I have a Rust struct `Model`. For the Python bindings, I created a wrapper:

```rust
use std::path::Path;

use anyhow::Result; // I'm using anyhow
use pyo3::prelude::*;

use mylib::Model;

#[pyclass(name = "Model")]
struct PyModel {
    inner: Model,
}

#[pymethods]
impl PyModel {
    #[staticmethod]
    pub fn load_file(file: &str) -> Result<PyModel> {
        let m = Model::load_file(Path::new(file))?;
        Ok(PyModel { inner: m })
    }

    pub fn infer(&mut self) -> Result<()> {
        // Performance bottleneck: lots of loops + fairly high memory usage
        self.inner.infer()?;
        Ok(())
    }
}

#[pyfunction]
pub fn test() -> Result<()> {
    let mut model = Model::load_file(Path::new("myfile.txt"))?;
    model.infer()?;
    Ok(())
}
```

When I compile (…), this code:

```python
import mylib
mylib.test()
```

is 100x faster (400ms vs 40s) than this code:

```python
import mylib
m = mylib.Model.load_file("myfile.txt")
m.infer()
```

Sorry for not providing a minimal example; it's a bit complicated with this type of performance issue. I've also partially simplified the code.

Additional info: (…)

That's it, I'd appreciate any ideas or potential tests I could do. Thanks a lot!

EDIT: Adding flamegraphs
-
I assume doing the …
-
Are you able to attach a flamegraph or any other profile data? There's nothing immediately obvious, but more information may spark ideas.
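In the meantime, you can get coarse numbers without a full profiler by timing the hot call directly in Rust. A minimal sketch, assuming you replace the `infer` wrapper from the question with something like this (the label string is just illustrative):

```rust
use std::time::Instant;

#[pymethods]
impl PyModel {
    pub fn infer(&mut self) -> Result<()> {
        // Time only the native inference call, excluding any
        // Python-side overhead, and print to stderr.
        let start = Instant::now();
        self.inner.infer()?;
        eprintln!("inner.infer() took {:?}", start.elapsed());
        Ok(())
    }
}
```

If that timer reports ~40s in the `Model.load_file(...)` + `infer()` path but the equivalent measurement inside `test()` is ~400ms, the slowdown is genuinely inside the native loop rather than in the binding layer.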
-
If you hadn't said it also happens in a debug build, I'd have suggested that the compiler is optimizing away everything that `test()` computes.
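If you want to rule out dead-code elimination explicitly, `std::hint::black_box` (stable since Rust 1.66) hides a value from the optimizer. A sketch of the `test` function from the question with that guard added:

```rust
use std::hint::black_box;

#[pyfunction]
pub fn test() -> Result<()> {
    let mut model = Model::load_file(Path::new("myfile.txt"))?;
    model.infer()?;
    // Force the optimizer to treat the model (and everything infer()
    // wrote into it) as observed, so the computation can't be elided.
    black_box(&model);
    Ok(())
}
```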
-
Given it's reportedly spending all the time in your inference loop, maybe print some basic stats about the number of iterations, check that the input data is the same, and check that the inference result is the same?
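A sketch of what that could look like in the wrapper; `iteration_count` and `output_checksum` are hypothetical accessors standing in for whatever `Model` actually exposes:

```rust
#[pymethods]
impl PyModel {
    pub fn infer(&mut self) -> Result<()> {
        self.inner.infer()?;
        // Hypothetical accessors -- adapt to the real Model API.
        // If both code paths report the same iteration count and the
        // same output checksum, they are doing identical work and the
        // 100x gap must come from somewhere else.
        eprintln!(
            "iterations = {}, checksum = {}",
            self.inner.iteration_count(),
            self.inner.output_checksum(),
        );
        Ok(())
    }
}
```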