20 changes: 13 additions & 7 deletions test/manifold/deferred_test.clj
@@ -287,21 +287,27 @@
 (deftest test-finally
   (let [target-d (d/deferred)
         d (d/deferred)
-        fd (d/finally
-             d
-             (fn []
-               (d/success! target-d ::delivered)))]
+        fd (-> d
+               (d/finally
+                 (fn []
+                   (d/success! target-d ::delivered)))
+               ;; to silence dropped error detection
+               (d/catch identity))]
     (d/error! d (Exception.))
     (is (= ::delivered (deref target-d 0 ::not-delivered)))))

 (deftest test-alt
   (is (#{1 2 3} @(d/alt 1 2 3)))
-  (is (= 2 @(d/alt (d/future (Thread/sleep 10) 1) 2)))
+  (is (= 2 @(d/alt (d/future (Thread/sleep 100) 1) 2)))
Contributor

Why does the 10 ms sleep make this test flaky?

Collaborator

Hmm. Probably because there's a non-zero chance of the following sequence:

  1. The deftest thread starts the future thread.
  2. The deftest thread gets swapped out.
  3. The future thread sleeps, runs, and returns.
  4. The deftest thread resumes and starts the alt logic, which checks its candidate values in random order.
  5. The deftest thread checks the future's result before "2", finds it already finished, and returns it.

It's unlikely, but not impossible. The longer the future thread sleeps, the less likely it is to finish before the alt logic has checked its values.
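
The sequence above can be sketched with a toy model. This is only an illustration under stated assumptions: `first-available` is a hypothetical helper that checks candidates in a given order, and a plain `promise` stands in for the sleeping future. It is not how Manifold implements `alt`.

```clojure
;; Toy model only: `first-available` is hypothetical, not Manifold's alt.
(defn first-available
  "Returns the value of the first candidate, in the given order, that is
  already available. Plain values are always available; pending objects
  (such as promises) are available once realized."
  [candidates]
  (first
    (for [c candidates
          :when (or (not (instance? clojure.lang.IPending c))
                    (realized? c))]
      (if (instance? clojure.lang.IPending c) @c c))))

;; Stands in for (d/future (Thread/sleep 10) 1).
(def slow (promise))

;; Steps 2-3: the "future" finishes while the test thread is swapped out.
(deliver slow 1)

;; Steps 4-5: the check order happens to put the future first.
(first-available [slow 2]) ;=> 1, the flaky outcome

;; The other order, or an unfinished future, yields the expected 2.
(first-available [2 slow])      ;=> 2
(first-available [(promise) 2]) ;=> 2
```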


Honestly, this test is not great. Using unit tests on probabilistic behaviors was always a recipe for flakiness.

One quick way to improve consistency is to set a seed during testing, so the random order used by alt/alt' is always the same.

It won't fix the underlying issue, though. As the sleep time approaches the time slice the scheduler gives each thread, the odds of the "sleeping" thread finishing before alt checks any of its values go way up.
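
For reference, the seeding idea could look roughly like the sketch below. `seeded-shuffle` is a hypothetical helper built on `java.util.Collections/shuffle`; whether alt/alt' offers a way to plug in a seeded random source is an assumption that this thread does not confirm.

```clojure
;; Hypothetical helper: a shuffle whose order depends only on the seed.
(defn seeded-shuffle
  "Returns a vector with the elements of `coll` in an order determined
  entirely by `seed`."
  [seed coll]
  (let [al (java.util.ArrayList. ^java.util.Collection coll)]
    (java.util.Collections/shuffle al (java.util.Random. (long seed)))
    (vec al)))

;; The same seed always produces the same order, so a test run is repeatable.
(= (seeded-shuffle 42 [:future 2])
   (seeded-shuffle 42 [:future 2])) ;=> true
```

A fixed order only makes a failure reproducible; it does not remove the race between the sleeping thread and the check.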

One possibility is to run it repeatedly and set a threshold for the expected proportion of runs that select 1 vs. 2. Of course, that proportion will still be affected by hardware, so...
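
A minimal sketch of the repeated-run idea, assuming a hypothetical helper `proportion-of`. The stand-in trial below always returns 2; a real test would pass a function that derefs the `d/alt` call, and any threshold (say 0.95) would be an arbitrary, hardware-dependent choice.

```clojure
;; Hypothetical helper, not part of Manifold or clojure.test.
(defn proportion-of
  "Runs the zero-argument fn `trial` n times and returns the fraction of
  results equal to `expected`, as a double."
  [n expected trial]
  (let [hits (count (filter #(= expected %) (repeatedly n trial)))]
    (double (/ hits n))))

;; Stand-in trial for illustration only.
(def always-two (constantly 2))

(proportion-of 100 2 always-two) ;=> 1.0

;; In a test, the check might read something like:
;; (is (>= (proportion-of 1000 2 trial) 0.95))
```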

Collaborator, Author

@DerGuteMoritz, Oct 1, 2025

Right, @KingMob's hypothesis is also what I arrived at after having seen it fail like that occasionally. This happened while I was working on the patch. I repeatedly ran the test from the REPL so there was probably a bit more thread scheduling going on compared to running the test suite once with lein test.

I agree that this test isn't great. However, I just realized that we can actually make it fully deterministic - see ec01b4a!


-  (is (= 2 @(d/alt (d/future (Thread/sleep 10) (throw (Exception. "boom"))) 2)))
+  (let [ef (d/future (Thread/sleep 100) (throw (Exception. "boom 1")))]
+    ;; to silence dropped error detection
+    (d/catch ef identity)
+    (is (= 2 @(d/alt ef 2))))

   (is (thrown-with-msg? Exception #"boom"
-        @(d/alt (d/future (throw (Exception. "boom"))) (d/future (Thread/sleep 10)))))
+        @(d/alt (d/future (throw (Exception. "boom 2")))
+                (d/future (Thread/sleep 100)))))

   (testing "uniformly distributed"
     (let [results (atom {})