A recent blog post by @alvises uses benchee, and I wondered why the author didn't use it for memory measurements. So I reached out, and it turns out that what we measure isn't what was important here:
> I had an issue though I actually still don't understand. Benchmarking this https://bit.ly/2ChIS1X code, benchee says 2.16GB memory usage, while the observer doesn't show any memory peak (around 32MBytes peak). Maybe I've put some wrong settings..
The thing is that we measure the total memory allocated, including memory that was later garbage collected, which is what we want :) However, it'd be interesting to see a couple of other measurements:
- maximum heap size of the process (i.e. how far does this push the memory consumption of my Erlang process). Streaming is a great example here, because we still deal with all the data, just not all at once (hence it is garbage collected in between). As such, the sample and the source for the large CSV seem helpful for testing (doesn't seem to allow downloads for non-users)
- retained memory, which as @michalmuskala pointed out might also be worth measuring: what's the long-term memory impact this operation has on my system (for stateless functions this should hopefully be 0)
The question is of course how to implement this. As our memory measurer already has all the data, the simplest solution might be to just let it return the 3 values in a tuple or similar.
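To make the three values concrete, here is a rough sketch of how they could be derived from the same data. It is not benchee's actual measurer: the module name, the function, and the input shape (heap size before the run, a list of `{size_before_gc, size_after_gc}` pairs, heap size after the run, all in words) are assumptions for illustration only.

```elixir
defmodule MemorySketch do
  # Hypothetical input shape (NOT benchee's real internals):
  #   initial   - heap size in words before the function runs
  #   gc_events - ordered list of {size_before_gc, size_after_gc} in words
  #   final     - heap size in words after the function returned
  #
  # Returns {total_allocated, max_heap, retained}.
  def summarize(initial, gc_events, final) do
    {allocated, peak, last} =
      Enum.reduce(gc_events, {0, initial, initial}, fn {before_gc, after_gc},
                                                       {alloc, peak, prev} ->
        # Growth since the previous sample was allocated in between,
        # even if the collection that follows frees it again.
        {alloc + (before_gc - prev), max(peak, before_gc), after_gc}
      end)

    # Account for allocations made after the last collection.
    total_allocated = allocated + (final - last)

    # Retained memory is only meaningful if `final` is sampled after a
    # last garbage collection; that is assumed here.
    {total_allocated, max(peak, final), final - initial}
  end
end
```

For example, `MemorySketch.summarize(100, [{600, 150}, {700, 120}], 400)` yields `{1330, 700, 300}`: 1330 words allocated in total, a peak of 700 words, and 300 words still held afterwards. This mirrors the observation in the quote, where the total allocated is far larger than the peak.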
Another interesting thing is the output. For a lot of use cases (pure functions) I expect all of these values to not change, so instead of printing 3 different sections we could just output "this value this, that value that and the other value that".
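A possible shape for that compact output, assuming "not change" means every run produced the same triple. The module, function name and wording are hypothetical and only illustrate the idea of collapsing identical samples into one line.

```elixir
defmodule MemoryOutputSketch do
  # Hypothetical formatter. `samples` is a non-empty list of
  # {allocated, max_heap, retained} tuples, one per run.
  def format(samples) when samples != [] do
    case Enum.uniq(samples) do
      # Every run produced the same triple: one line is enough.
      [{allocated, max_heap, retained}] ->
        "allocated: #{allocated}, max heap: #{max_heap}, " <>
          "retained: #{retained} (same on every run)"

      # Values differ between runs: fall back to full statistics.
      _varying ->
        "values differ between runs, print full statistics per measurement"
    end
  end
end
```

With two identical runs, `MemoryOutputSketch.format([{1330, 700, 300}, {1330, 700, 300}])` returns the single summary line; as soon as one run differs, the fallback branch is taken.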
This is a post-1.0 thing. It's very nice, but we should really get 1.0 out first, and it shouldn't be a breaking change to the outside as we'd just add fields/data, not remove or rename any.