
[Small Feature] Arrangement Caching Polyline Traits #5594

Open
sgiraudot wants to merge 45 commits into CGAL:main from sgiraudot:Arrangement-Lightweight_polyline_traits-GF

Conversation

@sgiraudot
Contributor

@sgiraudot sgiraudot commented Apr 8, 2021

Rationale

The current implementation of Arr_polyline_traits_2 copies chunks of points in most major steps:

  • when making a curve X-monotone, all input points are copied into new containers
  • when splitting a polyline, all input points are again copied into the newly created polylines

Profiling shows that these copies are very costly, especially when dealing with large polylines. The principle of this PR is to introduce an alternative traits class called Arr_caching_polyline_traits_2 that relies on pairs of iterators to perform shallow copies of chunks of points (with the addition of hanging extreme points that might not belong to the original range of points, for example points created by intersecting two polylines).

In practice, this implementation makes boolean operations run up to 2.8x faster on large data sets, and it never seems to be counter-productive.

Summary of API changes

  • A new class Arr_caching_polyline_traits_2 is provided

License and copyright ownership

(no change)

CHANGES.md

TODO

Submission

Status

@sgiraudot
Contributor Author

@efifogel this is just a draft of small feature: before going further (documenting and everything), I'd like to have your input.

This PR is based on #5094 as it follows the same logic of using X-monotone polylines to make boolean operations faster.

New traits

The current implementation of Arr_polyline_traits_2 copies chunks of points in most major steps:

  • when making a curve X-monotone, all input points are copied into new containers
  • when splitting a polyline, all input points are again copied into the newly created polylines

Profiling shows that these copies are very costly, especially when dealing with large polylines. The principle of this PR is to introduce an alternative traits class called Arr_lightweight_polyline_traits_2 that relies on pairs of iterators to perform shallow copies of chunks of points (with the addition of hanging extreme points that might not belong to the original range of points, for example points created by intersecting two polylines).

In practice, this implementation makes boolean operations run up to 2.8x faster on large data sets, and it never seems to be counter-productive.

Of course, compared to Arr_polyline_traits_2, this traits class has two limitations:

  • the traits need to be templated by the range, which means the type of the traits will be different depending on whether your input is a CGAL::Polygon_2, a std::list<Point_2>, etc.
  • the range needs to exist in memory outside of the traits: you cannot generate curves "on-the-fly", you always need a valid reference to the support (like a CGAL::Polygon_2)

For these reasons, I don't think it should replace the existing Arr_polyline_traits_2 but just be an alternative traits.
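To illustrate the two limitations above, here is a minimal sketch that does not use CGAL at all. The names Point, Polyline_view and Caching_polyline_traits_sketch are hypothetical stand-ins, not the classes of this PR; the sketch only shows why the traits type depends on the range type and why the range must outlive the curves.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <list>
#include <type_traits>
#include <vector>

// Hypothetical 2D point; stands in for a kernel point type.
struct Point { double x, y; };

// Hypothetical lightweight curve: a non-owning view on a whole range.
// The range is referenced by pointer, so it must outlive the view.
template <typename Range>
struct Polyline_view {
  using Iterator = typename Range::const_iterator;

  const Range* range = nullptr;  // support owned by the user
  Iterator first{}, last{};      // shallow "copy" of the points

  Polyline_view() = default;     // a pointer keeps this default constructible
  explicit Polyline_view(const Range& r)
    : range(&r), first(r.begin()), last(r.end()) {}

  std::size_t size() const
  { return static_cast<std::size_t>(std::distance(first, last)); }
};

// Hypothetical traits: the curve type is fixed by the range type.
template <typename Range>
struct Caching_polyline_traits_sketch {
  using Curve_2 = Polyline_view<Range>;
};

using Vector_traits = Caching_polyline_traits_sketch<std::vector<Point>>;
using List_traits = Caching_polyline_traits_sketch<std::list<Point>>;

// Limitation 1: different input ranges give unrelated curve types.
static_assert(!std::is_same<Vector_traits::Curve_2,
                            List_traits::Curve_2>::value,
              "curve types differ when the range types differ");
```

Limitation 2 shows up in the constructor: a curve can only be built from a range that already exists in memory, and destroying that range invalidates every curve that refers to it.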

I'm not quite sure about the name: I used Arr_lightweight_polyline_traits_2 as my main objective was indeed to make a traits that manipulates "lightweight" objects (pairs of iterators instead of deep copies), but you may have a better idea.

There might also be room for improvement in terms of implementation: I tried to avoid duplicating code as much as possible by making Arr_lightweight_polyline_traits_2 inherit from Arr_polycurve_basic_traits_2, but it may be possible to do better than what I did. I had to slightly modify Arr_polycurve_basic_traits_2 to make the polyline type a template parameter (the core of my implementation of lightweight polylines is in Arr_geometry_traits/Lightweight_polyline.h).

Use in free functions

Considering it always performs better than Arr_polyline_traits_2 in my experiments, I also modified the behavior introduced by #5094 to use Arr_lightweight_polyline_traits_2 by default. The limitations are not a problem here, as we always have the same range type (CGAL::Polygon_2 – even for PWH for which we have one CGAL::Polygon_2 per outer boundary + holes) and the polygons do exist in memory outside of the boolean operations.

I did not update the test_general_polygon_constructions test, which therefore fails (it uses the original traits). Updating it will require quite a lot of changes, as the API is quite different and the traits type is now fixed by the range type, so different cases need different types.

Insertion of PWH in GPS

One unrelated change that I nevertheless added to this branch concerns the function General_polygon_set_2::insert(const Polygon_with_holes&): I noticed that this function was significantly slower than the function that inserts a simple CGAL::Polygon_2. This seems to come from the fact that insert(PWH) uses a sweep, while insert(Polygon_2) just does one "locate" and directly updates the arrangement.

By slightly modifying insert(Polygon_2), we can use it to insert a hole (instead of marking the newly created face as in, we mark it as out when inserting a hole), and then just call insert(Polygon_2) on each hole of the PWH.

This makes the insertion of the PWH about 10x faster. This also gives me the intuition that the Surface Sweep in itself has a very big overhead…

For a reason I'm not sure I understand, the polygon validation algorithm fails with this new insert(PWH) version, so I added a boolean parameter that still makes it possible to use the sweep.

Status

The feature is developed and functional.

All tests of the BSO2 testsuite compile and run without error (except for the test_general_polygon_constructions test I was mentioning earlier).

So far it's not documented.

Please let me know what your opinion on this is, whether you see anything to change (naming, organisation, API, etc.) and how we can integrate it.

@efifogel
Member

efifogel commented Apr 12, 2021 via email

@sgiraudot
Contributor Author

First, I assume that using Handles, which in turn use reference counting
for points and curves, is still inefficient, because we copy many handles;
do you agree?

Indeed. Copying a range of n objects, however lightweight they are, is O(n), while copying a pair of iterators is O(1).

I must also add that CGAL Handles are not that lightweight: I did a test copying a large vector of EPECK points, and comparing that with copying pointers and copying reference wrappers. The results:

Copies = 2.63811
Pointers = 0.945099 (2.79136x faster)
Refs = 0.997299 (2.64525x faster)
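The O(n) versus O(1) argument can be checked without timing anything, by counting copy constructions. The following is only a sketch: Counted_point is a hypothetical instrumented type, not an EPECK point, and it says nothing about the measured timings above.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical point type that counts its copy constructions.
struct Counted_point {
  double x = 0, y = 0;
  static std::size_t copies;

  Counted_point() = default;
  Counted_point(const Counted_point& o) : x(o.x), y(o.y) { ++copies; }
  Counted_point& operator=(const Counted_point&) = default;
};
std::size_t Counted_point::copies = 0;

using Range = std::vector<Counted_point>;
using Iterator_pair =
  std::pair<Range::const_iterator, Range::const_iterator>;

// Deep copy of a chunk of points: one copy construction per point.
std::size_t copies_for_deep_copy(const Range& r) {
  Counted_point::copies = 0;
  Range chunk(r.begin(), r.end());
  (void)chunk;
  return Counted_point::copies;
}

// Shallow copy: only two iterators are copied, no point is touched.
std::size_t copies_for_shallow_copy(const Range& r) {
  Counted_point::copies = 0;
  Iterator_pair view(r.begin(), r.end());
  Iterator_pair copy = view;
  (void)copy;
  return Counted_point::copies;
}
```

For a range of n points, the first function reports n copies and the second reports none, whatever the cost of copying a single point.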

Is there any advantage using a shared pointer for the range (and not
just a pointer---line 52)

Users own the range, so they are responsible for its memory management. Ideally, this pointer would be a const reference, but that would make the object non-default-constructible (among other limitations), so using a pointer is easier.

We do have Arr_segment_traits_2 and Arr_non_caching_segment_traits_2.
How about Arr_caching_polyline_traits_2 ?

Sounds good! I'll use that.

  1. Where and how is the line of an Extreme_point (the second element) used?

Exactly the same as a line of an input point: this is similar to the line cache mechanism of the Arr_segment_traits_2.

The extreme points are only there to handle the fact that some points cannot come from the input range (points created by intersection), but that is an implementation detail of the lightweight polycurve: from an external point of view, there's no difference between an input point and an extreme point (both also embed the cached supporting line of the following segment).

Let's imagine you have a polyline only composed of input points:

Polyline A

p0   l0      p1   l1      p2   l2      p3   l3      p4  (l4 not used)
+------------+------------+------------+------------+

At some point, this polyline is split between p1 and p2, an extreme point pE (shared) is created:

Polyline B                    Polyline C

p0   l0      p1  l1  pE       pE lE  p2   l2      p3   l3      p4  (l4 not used)
+------------+-------+        +------+------------+------------+
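The split pictured above can be sketched as follows. This is not the code of Lightweight_polyline.h: the struct layout, the use of std::shared_ptr for the shared extreme point and the split signature are assumptions made for illustration, the cached lines are left out, and only the case of a split point strictly inside a segment is handled.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

struct Point { double x, y; };  // hypothetical point type
inline bool operator==(const Point& a, const Point& b)
{ return a.x == b.x && a.y == b.y; }

using Range = std::vector<Point>;

// Hypothetical lightweight polyline: a pair of iterators on the input
// range, plus optional extreme points that are not part of the range.
struct Lightweight_polyline {
  const Range* range = nullptr;
  Range::const_iterator first{}, last{};       // input points [first, last)
  std::shared_ptr<const Point> front_extreme;  // optional
  std::shared_ptr<const Point> back_extreme;   // optional

  std::size_t size() const {
    return (front_extreme ? 1 : 0)
      + static_cast<std::size_t>(last - first)
      + (back_extreme ? 1 : 0);
  }

  const Point& point(std::size_t i) const {
    assert(i < size());
    if (front_extreme) { if (i == 0) return *front_extreme; --i; }
    const std::size_t n = static_cast<std::size_t>(last - first);
    return (i < n) ? *(first + i) : *back_extreme;
  }
};

// Splits cv at pe, which lies strictly between the input points
// *(next - 1) and *next. No input point is copied.
std::pair<Lightweight_polyline, Lightweight_polyline>
split(const Lightweight_polyline& cv, Range::const_iterator next,
      const Point& pe) {
  assert(cv.first < next && next < cv.last);
  auto shared = std::make_shared<const Point>(pe);  // pE, shared by B and C

  Lightweight_polyline b = cv;  // B = [first, next) followed by pE
  b.last = next;
  b.back_extreme = shared;

  Lightweight_polyline c = cv;  // C = pE followed by [next, last)
  c.first = next;
  c.front_extreme = shared;
  return std::make_pair(b, c);
}
```

With Polyline A = p0..p4 and next pointing to p2, the result is B = (p0, p1, pE) and C = (pE, p2, p3, p4), and both halves refer to the same pE object.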

(Thanks also for the other remarks I did not reply to, I'll integrate the changes.)

@efifogel
Member

efifogel commented Apr 19, 2021 via email

@efifogel
Member

efifogel commented Apr 20, 2021 via email

public:

typedef T* iterator;
typedef T* const_iterator;
Member

missing const? + non-const begin/end

@github-actions

github-actions bot commented May 4, 2021

The documentation is built. It will be available, after a few minutes, here : https://cgal.github.io/5594/v1/Manual/index.html

@sgiraudot sgiraudot marked this pull request as ready for review May 4, 2021 07:04
@efifogel
Member

efifogel commented May 11, 2021 via email

@sgiraudot
Contributor Author

Alright, I managed to fix some tests. Split_2 was not implemented correctly, as I had never tested the cases where the split point is an existing point of the curve; it should be fixed now (the split test passes). Also, the default value of duplicate_first was set to true, which led to problems: all tested curves differed from the expected values of the testsuite.

Now there are still tests that fail, but I think here we reach the limits of this caching mechanism:

  • overlaps are handled in a suboptimal way, by outputting segments one by one using only extreme points: the reason is that overlaps cannot be handled correctly with caching traits. Picture for example 2 overlapping polylines for which the vertices along the overlap alternate:
CV1 = x--------x--------------x----x------------------x-----x
CV2 = x---x-----------x----------------------x--------------x

OUT = x---x----x------x-------x----x---------x--------x-----x

In that case, tests would expect one output curve when computing the intersection, but this is impossible with caching traits, as that curve would have to alternate between two separate ranges. This is why I output segments one by one using extreme points only: the output is correct (in the sense that it does return overlapping X-monotone curves) but suboptimal. This should not trigger any problem in practice, but tests that expect a specific unique polyline can never pass.

  • the merge test does not pass either, for a similar reason. I did not implement Are_mergeable_2 and Merge_2: in the case of caching polylines, two polylines would only be mergeable if they relied on the same range and shared a common indexed point. But in that case, there's no reason why these polylines would be separated in the first place… Two polylines relying on separate ranges can never be merged, even if they share a common geometric point (which is what is expected, from what I understand). I don't know in which cases these functors are used, but it seems to me that caching traits simply cannot support them.
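The overlap handling described in the first point can be sketched in one dimension. Here each vertex is reduced to a scalar position along the common overlap, which is a simplification made for this example; overlap_segments is a hypothetical helper and not the Intersect_2 functor of the traits.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <utility>
#include <vector>

// One output piece of the overlap, given by its two end positions.
using Segment = std::pair<double, double>;

// cv1 and cv2 hold the sorted vertex positions of two polylines along
// their common overlap. The output vertices alternate between the two
// inputs, so no single iterator pair can describe them: each piece is
// emitted as a separate two-point segment instead.
std::vector<Segment> overlap_segments(const std::vector<double>& cv1,
                                      const std::vector<double>& cv2) {
  std::vector<double> out;  // sorted union of both vertex sets
  std::set_union(cv1.begin(), cv1.end(), cv2.begin(), cv2.end(),
                 std::back_inserter(out));

  std::vector<Segment> segments;
  for (std::size_t i = 0; i + 1 < out.size(); ++i)
    segments.emplace_back(out[i], out[i + 1]);
  return segments;
}
```

With the 6 vertices of CV1 and the 5 vertices of CV2 from the drawing above, which share their two end points, the union has 9 vertices and the sketch returns 8 segments, where the tests would expect a single polyline.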

@efifogel
Member

efifogel commented May 11, 2021 via email

@sgiraudot sgiraudot changed the title from "[Small Feature] Arrangement Lightweight Polyline Traits" to "[Small Feature] Arrangement Caching Polyline Traits" May 12, 2021
@sgiraudot
Contributor Author

Alright, I tried to do a partial implementation of merge / are mergeable, but it appears that it's not sufficient for the tests in the testsuite (there are cases where curves not from the same range are tested for merging), so I did as you suggested and disabled the tests.
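For reference, a partial mergeability test of this kind could look like the sketch below. It is an assumption about the general shape, not the code that was tried in the branch: Polyline_view, are_mergeable and merge are hypothetical names, and curves are described by inclusive indices into their range.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Point { double x, y; };  // hypothetical point type
using Range = std::vector<Point>;

// Hypothetical curve: the points first..last (inclusive) of *range.
struct Polyline_view {
  const Range* range;
  std::size_t first;
  std::size_t last;
};

// Mergeable only if both curves rely on the same range and the last
// indexed point of a is the first indexed point of b. Geometric
// equality of points from two different ranges is not enough.
bool are_mergeable(const Polyline_view& a, const Polyline_view& b) {
  return a.range != nullptr && a.range == b.range && a.last == b.first;
}

// Merging is then a matter of widening the index interval.
Polyline_view merge(const Polyline_view& a, const Polyline_view& b) {
  assert(are_mergeable(a, b));
  return Polyline_view{a.range, a.first, b.last};
}
```

Two curves taken from two distinct ranges with identical coordinates fail this test, which is exactly the situation the testsuite exercises.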

I also fixed the missing precondition in Split_2, so the split tests in the assertions test pass.

The assertions test still fails on other parts: from what I understand, it expects a precondition violation on merge tests but gets an assertion violation. But I guess the merge test should not be run at all?

@efifogel
Member

efifogel commented May 17, 2021 via email

@maxGimeno maxGimeno added this to the 5.4-beta milestone Jun 4, 2021
@sloriot sloriot modified the milestones: 5.4-beta, 5.5-beta1 Sep 23, 2021
@MaelRL MaelRL modified the milestones: 5.5-beta, 5.6-beta Mar 28, 2022
@MaelRL MaelRL modified the milestones: 5.6-beta, 5.7-beta Mar 23, 2023
@janetournois janetournois modified the milestones: 6.0-beta, 6.1-beta May 16, 2024
@MaelRL MaelRL modified the milestones: 6.1-beta, 6.2-beta Mar 17, 2025
@sloriot sloriot changed the base branch from master to main September 16, 2025 19:20
@MaelRL MaelRL modified the milestones: 6.2-beta, 6.3-beta Mar 18, 2026

6 participants