pablorfb-meta
Contributor

Summary:
For an unknown reason, small payloads consistently perform better in Python than in Rust (D79577855), even though the messaging mechanism is powered by port handles and receivers in both implementations.

Once payload size increases, Rust outperforms Python (as expected).

Additionally, Python cannot reliably cast 1 GB of data during the benchmark without throwing a bunch of errors, so that payload size is excluded from the benchmark for now.

| Benchmark           | Time [Min, Median, Max] (ms) | Python Throughput [MiB/s] | Rust Throughput [MiB/s] | Throughput Change % (Python vs. Rust) |
|---------------------|------------------------------|---------------------------|-------------------------|---------------------------------------|
| hosts/1/size/10kb   | [0, 0, 4]                    | 17                        | 8.4220                  | +101.8%                               |
| hosts/1/size/100kb  | [0, 0, 4]                    | 140                       | 79.577                  | +75.9%                                |
| hosts/1/size/1mb    | [1, 2, 9]                    | 433                       | 535.84                  | -19.2%                                |
| hosts/1/size/10mb   | [18, 20, 33]                 | 466                       | 494.50                  | -5.8%                                 |
| hosts/1/size/100mb  | [202, 223, 318]              | 443                       | 518.56                  | -14.6%                                |
| hosts/10/size/10kb  | [1, 1, 59]                   | 67                        | 75.318                  | -11.1%                                |
| hosts/10/size/100kb | [1, 1, 59]                   | 537                       | 678.38                  | -20.9%                                |
| hosts/10/size/1mb   | [4, 6, 70]                   | 1451                      | 2843.14                 | -49.0%                                |
| hosts/10/size/10mb  | [52, 61, 145]                | 1575                      | 2243.11                 | -29.8%                                |
| hosts/10/size/100mb | [677, 720, 905]              | 1353                      | 2152.97                 | -37.1%                                |
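
For readers checking the last column: the figures appear to be the Python throughput expressed relative to the Rust throughput, so a positive value means Python was faster. The sketch below recomputes a few rows under that assumption. The helper name `throughput_change_pct` is made up for illustration and is not part of the benchmark code.

```python
def throughput_change_pct(python_mib_s: float, rust_mib_s: float) -> float:
    """Percent change of Python throughput relative to Rust throughput.

    Assumed formula (inferred from the table, not taken from the benchmark
    source): (python - rust) / rust * 100.
    """
    if rust_mib_s <= 0:
        raise ValueError("Rust throughput must be positive")
    return (python_mib_s - rust_mib_s) / rust_mib_s * 100.0


# (Python MiB/s, Rust MiB/s) pairs copied from the table above.
rows = {
    "hosts/1/size/100kb": (140, 79.577),
    "hosts/1/size/1mb": (433, 535.84),
    "hosts/10/size/1mb": (1451, 2843.14),
}

for name, (python_mib_s, rust_mib_s) in rows.items():
    change = throughput_change_pct(python_mib_s, rust_mib_s)
    print(f"{name}: {change:+.1f}%")
```

Because the Python column is rounded to whole MiB/s, rows with very small throughput (such as `hosts/1/size/10kb`) can differ from the published percentage in the last digit.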

Differential Revision: D80100828

@meta-cla meta-cla bot added the CLA Signed This label is managed by the Meta Open Source bot. label Aug 12, 2025
@facebook-github-bot
Contributor

This pull request was exported from Phabricator. Differential Revision: D80100828

pablorfb-meta added a commit to pablorfb-meta/monarch that referenced this pull request Aug 20, 2025
Summary:

For an unknown reason (needs deeper dive), small payloads consistently perform better in Python than in Rust (D79577855), even though the messaging mechanism is powered by port handles and receivers in both implementations.

Once payload size increases, Rust outperforms Python (as expected).

Additionally, Python cannot reliably cast 1 GB of data during the benchmark without throwing a bunch of errors, so that payload size is excluded from the benchmark for now.

| Benchmark           | Time [Min, Median, Max] (ms) | Python Throughput [MiB/s] | Rust Throughput [MiB/s] | Throughput Change % |
|---------------------|------------------------------|--------------------------|---------------------------|-------------------------------|
| hosts/1/size/10kb   | [0, 0, 4]                    | 17                       | 8.4220                    | +101.8%                       |
| hosts/1/size/100kb  | [0, 0, 4]                    | 140                      | 79.577                    | +75.9%                        |
| hosts/1/size/1mb    | [1, 2, 9]                    | 433                      | 535.84                    | -19.2%                        |
| hosts/1/size/10mb   | [18, 20, 33]                 | 466                      | 494.50                    | -5.8%                         |
| hosts/1/size/100mb  | [202, 223, 318]              | 443                      | 518.56                    | -14.6%                        |
| hosts/10/size/10kb  | [1, 1, 59]                   | 67                       | 75.318                    | -11.1%                        |
| hosts/10/size/100kb | [1, 1, 59]                   | 537                      | 678.38                    | -20.9%                        |
| hosts/10/size/1mb   | [4, 6, 70]                   | 1451                     | 2843.14                   | -49.0%                        |
| hosts/10/size/10mb  | [52, 61, 145]                | 1575                     | 2243.11                   | -29.8%                        |
| hosts/10/size/100mb | [677, 720, 905]              | 1353                     | 2152.97                   | -37.1%                        |

Reviewed By: pzhan9

Differential Revision: D80100828
Summary:

Measures how long it takes to cast a 1 KB message to N hosts with 8 actors and receive the replies, via local transport.
100 ms of processing time (included in the benchmark results below) was added to the tested endpoint.

| Benchmark Name           | Python p50 Latency (ms) | Rust p50 Latency  (ms) | % Diff |  % Diff w/o proc time|
|-------------------------|-------------------------------|-------------------------|---------------------------------|-----------------------|
| actor_count_1_median_ms  | 102                           | 101                  | .01%                           | 516%                  |
| actor_count_10_median_ms | 108                           | 101                 | .69%                          | 674%                  |
| actor_count_100_median_ms| 145                           | 102                 | 40.81%                          | 1510%                 |

* Crashes on a 1k-"host" setup
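
As a rough illustration of how the two "% Diff" columns relate, the sketch below compares median latencies with and without the fixed 100 ms of endpoint processing time. The formula and the helper name `pct_diff` are assumptions made for this example and are not taken from the benchmark code. The table was presumably computed from unrounded medians, so plugging in the rounded values shown above will not reproduce its figures exactly.

```python
PROCESSING_MS = 100.0  # fixed processing time added to the tested endpoint


def pct_diff(python_ms: float, rust_ms: float, overhead_ms: float = 0.0) -> float:
    """Percent by which the Python latency exceeds the Rust latency.

    `overhead_ms` is subtracted from both sides first, which isolates the
    messaging cost from the fixed processing time (assumed formula).
    """
    python_net = python_ms - overhead_ms
    rust_net = rust_ms - overhead_ms
    if rust_net <= 0:
        raise ValueError("Rust latency must exceed the subtracted overhead")
    return (python_net - rust_net) / rust_net * 100.0


# Rounded medians from the actor_count_10 row, used only as sample input.
python_p50_ms, rust_p50_ms = 108.0, 101.0

with_processing = pct_diff(python_p50_ms, rust_p50_ms)
without_processing = pct_diff(python_p50_ms, rust_p50_ms, PROCESSING_MS)

print(f"% diff including processing time: {with_processing:.2f}%")
print(f"% diff without processing time:   {without_processing:.0f}%")
```

The point of the second column is that the 100 ms of processing dominates both medians, so the relative gap in messaging overhead only becomes visible once it is subtracted.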

Reviewed By: pzhan9

Differential Revision: D79769681
@facebook-github-bot
Contributor

This pull request was exported from Phabricator. Differential Revision: D80100828

pablorfb-meta added a commit to pablorfb-meta/monarch that referenced this pull request Aug 21, 2025
@facebook-github-bot
Contributor

This pull request has been merged in 3ebb329.

Labels
CLA Signed This label is managed by the Meta Open Source bot. fb-exported Merged