Skip to content

Commit 6462ef3

Browse files
committed
Add notes on cartesianProduct performance
I came up with an implementation that is obviously big-O optimal, but that seems much slower in my limited tests. Add a note about that, and document the conjecture that the implementation we use is optimal too.
1 parent 5f6dad3 commit 6462ef3

File tree

1 file changed

+16
-5
lines changed

1 file changed

+16
-5
lines changed

Data/Set/Internal.hs

Lines changed: 16 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -1712,7 +1712,7 @@ powerSet :: Set a -> Set (Set a)
17121712
powerSet xs0 = insertMin empty (foldr' step Tip xs0) where
17131713
step x pxs = insertMin (singleton x) (insertMin x `mapMonotonic` pxs) `glue` pxs
17141714

1715-
-- | Calculate the Cartesian product of two sets.
1715+
-- | /O(m*n)/ (conjectured). Calculate the Cartesian product of two sets.
17161716
--
17171717
-- @
17181718
-- cartesianProduct xs ys = fromList $ liftA2 (,) (toList xs) (toList ys)
@@ -1727,10 +1727,21 @@ powerSet xs0 = insertMin empty (foldr' step Tip xs0) where
17271727
--
17281728
-- @since 0.5.11
17291729
cartesianProduct :: Set a -> Set b -> Set (a, b)
1730-
-- We don't know if this implementation (slightly modified from one
1731-
-- that Edward Kmett hacked together) is optimal. It would be interesting
1732-
-- to find out. TODO: try to get some clue about the big-O performance
1733-
-- so we can advise users.
1730+
-- I don't know for sure if this implementation (slightly modified from one
1731+
-- that Edward Kmett hacked together) is optimal. TODO: try to prove or
1732+
-- refute it.
1733+
--
1734+
-- We could definitely get big-O optimal (O(m * n)) in a rather simple way:
1735+
--
1736+
-- cartesianProduct _as Tip = Tip
1737+
-- cartesianProduct as bs = fromDistinctAscList
1738+
-- [(a,b) | a <- toList as, b <- toList bs]
1739+
--
1740+
-- Unfortunately, this is much slower in practice, at least when the sets are
1741+
-- constructed from ascending lists. I tried doing the same thing using a
1742+
-- known-length (perfect balancing) variant of fromDistinctAscList, but it
1743+
-- still didn't come close to the performance of Kmett's version in my very
1744+
-- informal tests.
17341745

17351746
-- When the second argument has at most one element, we can be a little
17361747
-- clever.

0 commit comments

Comments
 (0)