
Add framework to measure code base performance ✨ πŸ’Ž #558

Merged
RobertoPrevato merged 14 commits into main from feat/benchmarks on May 10, 2025
Conversation

@RobertoPrevato
Member

Tip

The benchmark-reports artifacts include Excel files with tables and charts.

Build

@RobertoPrevato RobertoPrevato merged commit 529d5e4 into main May 10, 2025
12 checks passed
@RobertoPrevato RobertoPrevato deleted the feat/benchmarks branch May 10, 2025 23:44
@bymoye
Contributor

bymoye commented May 11, 2025

BlackSheep performs best on Python 3.11! Why not 3.13?

@RobertoPrevato
Member Author

@bymoye I was surprised, too. I have no idea, but I think it must be related to CPython itself, because I executed the tests on the same commit. I am pretty satisfied with what I did with these benchmarks; the workflow is pretty cool! πŸ˜„

@RobertoPrevato
Member Author

However, the best performance in terms of memory consumption is with:

3.12.10 Windows-2022Server-10.0.20348-SP0
3.11.9 Windows-10-10.0.20348-SP0

πŸ‘πŸΌ


The worst in terms of memory consumption:

3.13.3 Linux-6.11.0-1013-azure-x86_64-with-glibc2.39
3.13.3 macOS-14.7.5-arm64-arm-64bit-Mach-O

πŸ‘ŽπŸΌ

@bymoye
Contributor

bymoye commented May 11, 2025

> However, the best performance in terms of memory consumption is with:
>
> 3.12.10 Windows-2022Server-10.0.20348-SP0
> 3.11.9 Windows-10-10.0.20348-SP0
> πŸ‘πŸΌ
>
> The worst in terms of memory consumption:
>
> 3.13.3 Linux-6.11.0-1013-azure-x86_64-with-glibc2.39
> 3.13.3 macOS-14.7.5-arm64-arm-64bit-Mach-O
> πŸ‘ŽπŸΌ

This is very strange... I was under the impression that Linux systems should perform best. Maybe it has something to do with the memory management mechanisms of each system?

@RobertoPrevato
Member Author

RobertoPrevato commented May 11, 2025

I find it strange, too. The specs for GitHub runners are the same for Windows and Linux. macOS runners have fewer resources, yet I got better results on them at each run.

@bymoye
Contributor

bymoye commented May 11, 2025

Hi, @RobertoPrevato

I did some research on the possible reason: maybe memory_profiler results from different platforms cannot be compared with each other, because memory_profiler uses psutil, and psutil works differently on Linux, Windows, and macOS. psutil-windows-process-memory-usage

Related documentation: memory_full_info

Maybe you should use `mem_usage = memory_usage(wrapper, interval=0.01, timeout=30, backend='psutil_uss')` to test. It defaults to psutil (RSS); using USS will measure the memory unique to the process instead of including shared memory.
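
For illustration, a minimal sketch of the difference (assuming memory_profiler and psutil are installed; the `wrapper` function here is just a stand-in for the benchmarked callable):

```python
import psutil
from memory_profiler import memory_usage

def wrapper():
    # Stand-in for the benchmarked callable: allocate some memory.
    data = [b"x" * 1024 for _ in range(10_000)]
    return len(data)

# The default backend samples RSS (resident set size), which includes
# memory shared with other processes and is accounted differently on
# Linux, Windows, and macOS.
rss_samples = memory_usage(wrapper, interval=0.01, timeout=30)

# USS (unique set size) counts only the memory private to the process,
# so it should be more comparable across platforms.
uss_samples = memory_usage(wrapper, interval=0.01, timeout=30, backend="psutil_uss")

print(f"peak RSS: {max(rss_samples):.2f} MiB")
print(f"peak USS: {max(uss_samples):.2f} MiB")

# The same distinction directly via psutil:
proc = psutil.Process()
print(proc.memory_info().rss)        # RSS, in bytes
print(proc.memory_full_info().uss)   # USS, in bytes
```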

@RobertoPrevato
Member Author

Hi @bymoye
Thank you for the heads-up. I'll fix it ASAP (I am documenting the new features of 2.3.0 right now).

RobertoPrevato added a commit that referenced this pull request May 11, 2025
Use `export PYTHONOPTIMIZE=1`.
Address #558 (comment)
@RobertoPrevato
Member Author

@bymoye `backend='psutil_uss'` fails on macOS. πŸ‘€ https://github.com/Neoteroi/BlackSheep/actions/runs/14958347197/job/42017019493

I'll check more another day.

@bymoye
Contributor

bymoye commented May 12, 2025

@RobertoPrevato

I used psutil to replace the original implementation; I am not sure whether it is consistent with the original functional requirements: bymoye@2614f00

The following results were obtained:

https://github.com/bymoye/BlackSheep/actions/runs/14961845051

@RobertoPrevato
Member Author

> @RobertoPrevato
>
> I used psutil to replace the original implementation; I am not sure whether it is consistent with the original functional requirements: bymoye@2614f00
>
> The following results were obtained:
>
> https://github.com/bymoye/BlackSheep/actions/runs/14961845051

Thanks, very much appreciated. You are correct that USS should be used here.

I would add garbage collection before each function execution (this assumes `import gc` at the top of the module):

```diff
 while time.time() - start_time < 30:  # timeout=30 seconds
+    gc.collect()
     wrapper()
```
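
A self-contained sketch of that loop, for clarity (the name `run_with_gc` is illustrative; `wrapper` is the benchmarked callable as above):

```python
import gc
import time

def run_with_gc(wrapper, timeout=30.0):
    """Call `wrapper` repeatedly for `timeout` seconds, forcing a
    garbage collection before each call so that garbage left over
    from the previous iteration does not inflate the peak-memory
    reading of the current one."""
    start_time = time.time()
    calls = 0
    while time.time() - start_time < timeout:
        gc.collect()  # reclaim unreachable objects before the next call
        wrapper()
        calls += 1
    return calls
```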

The results of `app_handle_text_response_peak_mb` look a bit weird:

| app_handle_text_response_peak_mb | app_handle_text_response_controller_peak_mb |
| --- | --- |
| 356.65000000 | 28.05625000 |
| 353.15000000 | 27.10000000 |
| 350.40312500 | 26.65937500 |
| 331.13437500 | 23.27265625 |
| 328.94531250 | 28.15234375 |
| 319.26328125 | 27.78984375 |
| 311.47109375 | 23.67968750 |
| 306.60546875 | 22.79296875 |
| 285.61015625 | 29.01171875 |
