Fix performance issues for vector-based storage by AnthonyVH · Pull Request #413 · boostorg/graph

AnthonyVH · 2025-01-15T18:14:51Z

None of the functions called inside vertex_by_property() required a mutable graph object, but vertex_by_property() itself did. This is now fixed.
At several locations in the code base there was the equivalent of container.reserve(container.size() + 1), e.g. add_vertex(). Calling such an add_vertex() in a loop with an std::vector backing the vertex storage results in O(N²) time complexity instead of the expected (amortized) O(N). This is fixed by allowing the container to grow as expected.

jeremy-murphy · 2025-01-16T09:31:36Z

Thanks for your pull request.

None of the functions called inside vertex_by_property() required a mutable graph object, but vertex_by_property() itself did. This is now fixed.

Yup, that's good.

At several locations in the code base there was the equivalent of container.reserve(container.size() + 1), e.g. add_vertex(). Calling such an add_vertex() in a loop with an std::vector backing the vertex storage results in O(N²) time complexity instead of the expected (amortized) O(N). This is fixed by allowing the container to grow as expected.

I don't actually see why this is so, unless resize is doing something really pessimistic and reallocating the entire vector every time. Have you benchmarked this to confirm the change in performance?
Regardless, I prefer your implementation to the existing one, although please use auto where you can.
Can you use emplace_back() to create an empty vector instead of push_back(std::vector< cg_vertex >())?

AnthonyVH · 2025-01-16T10:38:52Z

I don't actually see why this is so, unless resize is doing something really pessimistic and reallocating the entire vector every time.

Everything I'm writing here assumes that add_vertex() is going to be called in a loop. Which I guess is a pretty reasonable assumption, how else is anyone going to construct a graph.

TLDR: Calling resize(prev_size + 1) forced reallocation on every call to add_vertex(), making it O(N). Letting the std::vector decide when to resize itself by calling push_back()/emplace_back() makes that O(1).

Now, for the long and windy explanation. Inside add_vertex() the old code did vertices.resize(prev_size + 1). From std::vector<>::resize():

Resizes the container to contain count elements, does nothing if count == size(). If the current size is greater than count, the container is reduced to its first count elements.

and

Complexity: Linear in the difference between the current size and count. Additional complexity possible due to reallocation if capacity is less than count.

I.e. in case the new size is larger than the old, resize() is going to increase the capacity of the underlying storage to exactly the requested new size. Crucially, it is not going to apply the growth factor it would when an "automatic" capacity increment happens (e.g. if you call push_back() on an std::vector for which size() == capacity()).

Note that it clearly states that the complexity is linear, not O(1). Growing an std::vector by pushing elements into it is amortized O(1). It's exactly because of reallocation that this is amortized. Assume an empty std::vector with capacity X, then the first X pushes are "free". On the X + 1th push_back() the underlying storage is full, so std::vector will call resize() internally, move/copy all existing X elements to new storage, and then push the X + 1th element on. So on average you pay O(X) / X = O(1) per push_back(). When this automatic resize happens the capacity will grow by a certain factor (1.5 for MSVC/libc++, and 2 for libstdc++ AFAIK), which ensures the amortized O(1) complexity for element insertion.

Manually calling resize()/reserve() in a loop is always a bad idea, because it ensures the automatic resizing is side-stepped. Now every one of those calls is going to reallocate all of the existing elements in the std::vector, because the capacity was only large enough to hold prev_size elements. Meaning every add_vector() calls takes O(N) time, whereas normally resize() would only be called every N add_vector() calls, meaning it could be O(1).

Have you benchmarked this to confirm the change in performance?

I did not see existing performance benchmarks, so I didn't try to implement one. However, the behavior of calling resize() is well defined, so for sure the complexity of add_vertex() goes from O(N) to O(1) with my change.

Here is a quick benchmark I threw together illustrating the difference: quick-bench. Note that the difference is only 10% here, most likely because there's not that much data to copy around.

I tried creating another benchmark where each entry in the vector is larger (which would be the case when e.g. a vertex has bundled properties that take up a reasonable amount of space). Unfortunately the quick-bench.com website kept timing out on that. I expect the difference to be much more pronounced in this case. If you'd like to try yourself, here's the code in Compiler Explorer (you can hit "Quick bench" in the top left there).

Edit: I realize the benchmarks show only a small speed difference. I guess the compiler somehow sees through the whole thing? I know of instances in industrial code where this exact same change caused a massive speed-up.

AnthonyVH · 2025-01-16T10:45:51Z

Here's another snippet from cppreference.com that describes the issue with the resize() call (which will call reserve() under the hood). From std::vector<>::reserve():

Correctly using reserve() can prevent unnecessary reallocations, but inappropriate uses of reserve() (for instance, calling it before every push_back() call) may actually increase the number of reallocations (by causing the capacity to grow linearly rather than exponentially) and result in increased computational complexity and decreased performance. For example, a function that receives an arbitrary vector by reference and appends elements to it should usually not call reserve() on the vector, since it does not know of the vector's usage characteristics.

When inserting a range, the range version of insert() is generally preferable as it preserves the correct capacity growth behavior, unlike reserve() followed by a series of push_back()s.

jeremy-murphy · 2025-01-22T09:06:08Z

Ohh, I get it, sorry, I just completely forgot that push_back will amortize the reallocations.

jeremy-murphy · 2025-01-22T09:14:19Z

Thanks for a great improvement to the efficiency!

I know the code change was very small, but in future please don't combine unrelated changes into one PR.

AnthonyVH · 2025-01-22T09:20:22Z

Thanks a lot for the quick merge! I'll try to better pay attention to the PR hygiene next time 😁.

jeremy-murphy · 2025-03-24T23:30:18Z

Btw, whatever differences you can benchmark, I can put in the release notes. At the moment, I think the only thing I can say, and correct me if I get it wrong, is "A 10% improvement to graph construction when using adjacency_list with vector-based storage."
Do I need to be more specific about exactly where/what the vector is used for? I mean, edge list vs vertex list?

AnthonyVH · 2025-03-25T16:19:23Z

I've tried creating a benchmark to show the result before and after my changes, but am finding it exceedingly difficult to trick the compiler. So unfortunately for now, I can't present a meaningful result (i.e. a plot clearly showing O(N) vs O(N^2) add_vertex behavior). I'll have another go later this week.

The fix was only for the vertex list. (I did just see that the same resize() call exists for add_edge() in detail/adjacency_list.hpp).

jeremy-murphy · 2025-03-25T22:33:26Z

No worries, thanks for trying. It's hard to outsmart the compiler. 😂

AnthonyVH added 2 commits January 16, 2025 11:49

Allow getting vertex property from const graph object

9096f61

Fix vector insertion complexity from O(N) to O(1)

15b587a

jeremy-murphy merged commit a372a07 into boostorg:develop Jan 22, 2025
22 checks passed

AnthonyVH deleted the anthonyvh/fix_reserve_performance branch January 22, 2025 09:22

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix performance issues for vector-based storage#413

Fix performance issues for vector-based storage#413
jeremy-murphy merged 2 commits intoboostorg:developfrom
AnthonyVH:anthonyvh/fix_reserve_performance

AnthonyVH commented Jan 15, 2025

Uh oh!

jeremy-murphy commented Jan 16, 2025 •

edited

Loading

Uh oh!

AnthonyVH commented Jan 16, 2025 •

edited

Loading

Uh oh!

AnthonyVH commented Jan 16, 2025

Uh oh!

jeremy-murphy commented Jan 22, 2025

Uh oh!

Uh oh!

jeremy-murphy commented Jan 22, 2025

Uh oh!

AnthonyVH commented Jan 22, 2025

Uh oh!

jeremy-murphy commented Mar 24, 2025

Uh oh!

AnthonyVH commented Mar 25, 2025

Uh oh!

jeremy-murphy commented Mar 25, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

AnthonyVH commented Jan 15, 2025

Uh oh!

jeremy-murphy commented Jan 16, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

AnthonyVH commented Jan 16, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

AnthonyVH commented Jan 16, 2025

Uh oh!

jeremy-murphy commented Jan 22, 2025

Uh oh!

Uh oh!

jeremy-murphy commented Jan 22, 2025

Uh oh!

AnthonyVH commented Jan 22, 2025

Uh oh!

jeremy-murphy commented Mar 24, 2025

Uh oh!

AnthonyVH commented Mar 25, 2025

Uh oh!

jeremy-murphy commented Mar 25, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

jeremy-murphy commented Jan 16, 2025 •

edited

Loading

AnthonyVH commented Jan 16, 2025 •

edited

Loading