-
Notifications
You must be signed in to change notification settings - Fork 95
Add pairwise intersection between segments with sweep-line algorithm #1251
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Benchmark Results (Julia v1)Time benchmarks
Memory benchmarks
|
Benchmark Results (Julia vlts)Time benchmarks
Memory benchmarks
|
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## master #1251 +/- ##
=======================================
Coverage 87.86% 87.87%
=======================================
Files 196 197 +1
Lines 6181 6227 +46
=======================================
+ Hits 5431 5472 +41
- Misses 750 755 +5 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @souma4! First round of review attached. This one should be much easier to get ready for a merge.
…s namespace open for a 2D sweep (if needed) in the future. Also fixed overlaps logic to properly break early
src/utils/intersect.jl
Outdated
| struct SweepLineInterval{T<:Number} | ||
| start::T | ||
| stop::T | ||
| segment::Any | ||
| index::Int | ||
| end | ||
|
|
||
| function overlaps(i₁::SweepLineInterval, i₂::SweepLineInterval) | ||
| i₁.start ≤ i₂.start && i₁.stop ≥ i₂.start | ||
| end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we really need this data structure?
The new algorithm seems very simple:
For each segment, do
- Compute start and save in a vector
starts - Compute stop and save in a vector
stops
Sort vectors segs, starts and stops in terms of sortperm(starts)
For each pair of segments, do
- Fast check with
startsandstops - If check is true, call expensive intersection
- Otherwise, advance the loop
If we simply implement this algorithm with raw vectors segs, starts and stops, it will be probably be more readable and type stable.
Also, the function sweep1d can probably be merged into the main pairwiseintersect function after that. The only part of the code that deserves encapsulation is the kernel of the loop that uses starts, stops for a fast check and only then computes intersections with segs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alternatively, if you feel that the readability with a single sort is superior, then use a simple vector of tuples preproc = [(start, stop, seg), ...] and sort(preproc, by=first).
When you introduce a data structure like SweepLineInterval and use its fields in a separate function (e.g., by=x->x.start), you are accessing internals.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We really don't. The main idea was to have everything tied to the same object, but this way works well. It's actually significantly faster because it is effectively a struct of arrays rather than array of structs. It's now the fastest test I've run thus far.
… to vectors of indices because sets are expensive to initialized
src/utils/intersect.jl
Outdated
| oldindices = collect(1:n)[sortedindices] | ||
|
|
||
| # primary sweepline algorithm | ||
| 𝐺 = Dict{Point,Vector{Int}}() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This type is not concrete. Please use typeof if necessary. Is Dict the best data structure for this loop?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
doh, you're right, Point isn't concrete. Easy fix. that simple fix improved the speed 2x 😮
I argue that Dict is the best data structure largely because what we want is a set of unique points and a set of corresponding segment indices for each point. In the case of more than two lines intersecting at a point, the dictionary allows us to easily check if that point already exists, then we can just append new indices to the indices.
Vector isfeasible, but they are a bit messier to handle with little to no speed improvement. That's largely because if we want unique keys in a Vector, we still have to find which indices are duplicated (a large batch membership equality check) then append segment indices as needed.
Set wouldn't work because the order of the set may be different from the segment indices, thus losing which indices belong to which intersection point.
Now, a true "graph" would probably be optimal, but adds the complexity of implementing said system.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did a check on if we could derive a graph from this output easily, and yes we can. We can build adjacency and segment-points (reverse of current point-segments) graphs in 50% of the time as creating the line segment intersections (in a dense case).
src/utils/intersect.jl
Outdated
| geometric intersections](https://ieeexplore.ieee.org/document/1675432) | ||
| """ | ||
| function pairwiseintersect(segments; digits=_digits(segments)) | ||
| P = typeof(vertices(first(segments))[1]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please define P as a function of segs as it is the actual variable used in the algorithm below.
src/utils/intersect.jl
Outdated
| function _checkintersection(i, k, starts, stops, segs) | ||
| overlap = _overlaps(starts[i], stops[i], starts[k]) | ||
| overlap || return :break | ||
| intersection(Segment(segs[i]), Segment(segs[k])) do 𝑖 | ||
| t = type(𝑖) | ||
| (t === Crossing || t === EdgeTouching) ? get(𝑖) : :continue | ||
| end | ||
| end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we move this logic to inside the main loop? I think the code is simple enough to be pasted there without the :break and :continue symbols as return type.
src/utils/intersect.jl
Outdated
| segs = segs[inds] | ||
| # keep track of original indices | ||
| n = length(segs) | ||
| oldindices = (1:n)[inds] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't this equal to inds? How could we sort 1:n according to inds and not get inds as the result?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It actually is equivalent 🤦 . I think I had it in my head that sortperm wasn't sortperm, the indices for the sorted order, which is what is used for old indices. thanks as always for catching these.
src/utils/intersect.jl
Outdated
| pairwiseintersect(segments; [digits]) | ||
| Compute pairwise intersections between n `segments` | ||
| with `digits` precision in O(n⋅log(n+k)) time using |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need to define k in the docstring
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There we go! 🎉
It is a good exercise to compare the first commit of this PR with the final version to gain experience with idiomatic Julia/Meshes.jl code. The simpler, the better 🥇
Upon reviewing the performance of comparative implementations, I decided to move from the very complicated and numerically unstable implementation in #1168 to a simpler interval sweepline (ie, we iterate over a sorted sequence by the furthest left x and check intersections if two segments have overlapping intervals).
This is equivalent to the Bentley-Ottmann algorithm in sparse cases and is O(N^2) in dense cases. However, in dense cases, Bentley-Ottmann computes many more intersections (to traverse the Binary search tree). These roughly balance out. In unofficial tests against a dense vector of segments, this implementation was the second fastest, only slower than crate::geo.
Profiling shows most time is spent exclusively in runtime and GC dispatch within the
Intersection(...) doblock, so improving efficiency there would be valuable.