Skip to content

NOJIRA-270: add retry macro; add it to a failing test#318

Merged
ktakashi merged 5 commits intoktakashi:masterfrom
fmv1992:nojira_270
Nov 25, 2025
Merged

NOJIRA-270: add retry macro; add it to a failing test#318
ktakashi merged 5 commits intoktakashi:masterfrom
fmv1992:nojira_270

Conversation

@fmv1992
Copy link
Contributor

@fmv1992 fmv1992 commented Nov 22, 2025

No description provided.

@fmv1992
Copy link
Contributor Author

fmv1992 commented Nov 22, 2025

@ktakashi , this adds the macro and adds it to the failing test.

My next steps would be to improve the docs on how to get up and running with tests. I need confidence that the tests are green so that I'm positive when I break things.

Thanks for your feedback.

Cross linking: #313 .

@fmv1992
Copy link
Contributor Author

fmv1992 commented Nov 22, 2025

I need to change how retry works; upon failure it should re-raise the last exception instead of returning a unique object (the unique object would signal failure, since it would be out of the result set of the function).

@fmv1992
Copy link
Contributor Author

fmv1992 commented Nov 23, 2025

@ktakashi , done, not retry simply runs the thunk on the last attempt (thus raising if that happens).

@ktakashi
Copy link
Owner

The retry doesn't solve the test failure, but only retries in case of exception is thrown. Is this the intention of the change? Also, the current implementation of retry doesn't return the unique-value in case of non-continuable condition is raised.

Here is the simple case to proof my point

(import (rnrs))

(with-exception-handler
 (lambda (e) #t) ;; this is not allowed
 (lambda () (error 'test "msg")))
;; the above expression ends up an error

To avoid this, you need to use guard instead of with-exception-handler.

@fmv1992
Copy link
Contributor Author

fmv1992 commented Nov 23, 2025

The retry doesn't solve the test failure, but only retries in case of exception is thrown. Is this the intention of the change?

Yes. I know the ideal scenario would be to tackle the root cause, but my suggestion (it's up for your approval, of course) would be to have the CI/CD stable and then become more stringent with failing tests, perhaps explicitly not allowing them to non-deterministically fail. It's a tough call; big test suites are hardly deterministic, specially supporting so many systems, and involving networking, etc.

To avoid this, you need to use guard instead of with-exception-handler.

Indeed, I did just that. I'm a novice in Scheme, so I have not internalized the idea of having continuable and non continuable exceptions.

Now I'm using a guard.

The next step would be to add this to every test that's non deterministically failing, and consider the tests stable afterwards (with a possibly "policy" change on tests failing).

What do you think? The ultimate goal is to a have a test suite that coincides with what is acceptable or not.

Thanks for your review.

@ktakashi
Copy link
Owner

If you know which procedure is raising an error during the test, then it'd be better to do it like this then

;; rewrite retry macro syntax like this
(define-syntax retry
  (syntac-rules ()
    ((_ n-retry expr ...)
      (let ((procedure (lambda () expr ...)))
        // retry implementation
        ...))))

(let ((r (retry 5 (${target-procedure} ...))))
  // proceed the test
  (test-assert ...)
  ...)

This way, the test failure won't be double (or more).

@fmv1992
Copy link
Contributor Author

fmv1992 commented Nov 24, 2025

This is the specific error I'm targeting, unfortunately I did not save CircleCI's URL (but it's there somewhere):

-;
-; %%%% Starting test Simple server framework
-; Unhandled exception
-;   Condition components:
-;   1. &socket #<socket client:632436 "packer-68ee8680-6a46-1a36-0dda-6be79b5bc172(127.0.0.1):12345" 0x42bd2f8>
-;   2. &who socket-recv
-;   3. &message An existing connection was forcibly closed by the remote host.
-;   4. &irritants (10054)
-;   5. &stack-trace
-;
-; stack trace:
-;   [1] %socket-recv
-;   [2] socket-recv
-;     src: (%socket-recv sock len flags)
-;     "C:\\Users\\circleci\\project\\ext\\socket\\sagittarius\\socket.scm":220
-;   [3] #f
-;     src: (socket-recv sock 5)
-;     "C:/Users/circleci/project/test/tests/net/server.scm":188
-;   [4] eval
-;   [5] main-loop
-;   [6] dynamic-wind
-;   [7] #<identifier loop#(core base)>
-;     src: (proc (car l))
-;     "C:\\Users\\circleci\\project\\lib\\core\\inline.scm":96
-;   [8] dynamic-wind
-;   [9] dynamic-wind
-;   [10] eval
-;   [11] script

If I understood your suggestion, I don't think that if I retry at the function level, (socket-recv sock 5) this will succeed. I think the whole test needs to be re-run.

It's a good point that having the test fail more than once is a bit weird, but between it failing 1 time versus say 3 times I don't think it changes much.

I can add the retry at this level:

(let ((sock (make-client-socket "localhost" "12345")))

But I'm not sure it will actually do the trick here.

What do you think?

Thanks for your feedback.

@ktakashi
Copy link
Owner

It'd be better to then change the test like this

(guard (e ((socket-error? e) (test-assert "server socket closed" #t))
          (else (test-assert (condition-message e) #f)))
  (let ((bv (socket-recv sock 5)))
    (test-equal #vu8(1 2 3 4 5) bv)))

The fundamental issue of the test failure is race condition, caused by the procedure of detached-actor, then it's more or less expected behaviour of having an error. So, checking if the error is the expected one (in this case &socket and can be checked with socket-error?) is better and cleaner.

@fmv1992
Copy link
Contributor Author

fmv1992 commented Nov 25, 2025

Sounds good.

I tried to reproduce this on my Linux but it seems it has a different semantics on sockets closing compared to Windows (Linux seems to allow reading buffered data after the socket is closed).

Anyway, here's your suggested fix; I'll move on to other test fixes (I'm open to ideas).

Thanks for the feedback.

@ktakashi ktakashi merged commit 438ec91 into ktakashi:master Nov 25, 2025
@fmv1992 fmv1992 deleted the nojira_270 branch November 26, 2025 16:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants