Introduce benchmark CI and Python binding test suite#45
Conversation
No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)

Walkthrough
Adds a GitHub Actions benchmark workflow, a Python test CI job, pytest configuration and fixtures, and a suite of new pytest modules exercising client/server behavior, error handling, streaming, roundtrips, and data-type roundtrips for the btlightning-py crate.

Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 failed (1 warning)
Force-pushed dbd1fe0 to 2e0b5d8 (compare)
Force-pushed 81925c9 to 29d2893 (compare)
Force-pushed 29d2893 to 2e4d482 (compare)
Benchmark Results
(results tables not captured; sections: Connection Setup (ms), Latency (ms), Throughput (req/s), Wire Bytes)
@coderabbitai review

✅ Actions performed
Review triggered.
Actionable comments posted: 8
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.github/workflows/bench.yml:
- Around line 75-76: The ternary expression that sets the benchmark marker uses
a 15% threshold but should use 10% to match the stated tolerance—update both
occurrences of the literal 15 in the expression that compares pct (the ternary
using ? (pct > 15 ? ' !!!' : pct < -15 ? ' +++' : '') : (pct < -15 ? ' !!!' :
pct > 15 ? ' +++' : '')) to 10 so the regression/improvement markers reflect a
±10% tolerance while preserving the existing sign and ordering logic.
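The marker logic described above can be sketched in Python. This is an illustrative translation, not the workflow's actual code: the original JavaScript ternary's guarding condition is truncated in the review, so the `lower_is_better` flag here is an assumption about what that condition distinguishes.

```python
def marker(pct: float, lower_is_better: bool, tol: float = 10.0) -> str:
    """Annotate a percentage change: ' !!!' = regression, ' +++' = improvement,
    '' = within the +/-tol band. For lower-is-better metrics (latency, wire
    bytes) an increase is a regression; for higher-is-better metrics
    (throughput) a decrease is."""
    if lower_is_better:
        if pct > tol:
            return " !!!"
        if pct < -tol:
            return " +++"
    else:
        if pct < -tol:
            return " !!!"
        if pct > tol:
            return " +++"
    return ""
```

This mirrors the ternary's sign and ordering logic with the literal changed from 15 to 10, as the finding requests.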
In `@crates/btlightning-py/tests/conftest.py`:
- Around line 27-33: The fixture uses a brittle fixed sleep and doesn't join the
serve thread; change it to wait for the server to be ready (poll a ready
predicate or attempt a short connect to free_port with a timeout loop) instead
of time.sleep(0.05) before yielding, and after calling server.stop() join the
thread (t.join(timeout)) to ensure the serve_forever thread exits and avoid
leakage; update references around server.start(), threading.Thread(...
target=server.serve_forever) / t.start(), the yield server, free_port, and
server.stop() to implement the readiness loop and t.join().
In `@crates/btlightning-py/tests/test_errors.py`:
- Around line 35-36: The test currently uses a blind pytest.raises(Exception)
for the query_axon error-path; replace that with the concrete exception classes
raised by the bindings (e.g., the specific Python error type constructed in the
Rust wrapper) for the "missing signer" and "invalid timeout" cases. Locate where
query_axon maps errors to Python exceptions (search for query_axon and the
PyErr/new_err/Py*Error creation in the bindings), then change the two
pytest.raises(Exception) calls in the test to
pytest.raises(<ConcreteExceptionClass>) using the exact exception types you find
and ensure the raised message assertion (if any) matches the expected text.
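A toy stand-in shows the narrowing the comment asks for. The exception classes and messages below are illustrative only; the real types must be read from the Rust wrapper's `PyErr` construction before updating the tests.

```python
import pytest

# Stand-in for the binding's query_axon; error types here are assumptions.
def query_axon(signer=None, timeout=1.0):
    if signer is None:
        raise ConnectionError("endpoint not initialized")
    if timeout <= 0:
        raise ValueError("invalid timeout")
    return b"ok"

def test_missing_signer():
    # Concrete class plus a match pattern, instead of a blind
    # pytest.raises(Exception) that would also swallow unrelated bugs:
    with pytest.raises(ConnectionError, match="endpoint not initialized"):
        query_axon(signer=None)

def test_invalid_timeout():
    with pytest.raises(ValueError, match="invalid timeout"):
        query_axon(signer="validator-key", timeout=0)
```

`pytest.raises(Exception)` passes even when the wrong error is raised, which is exactly the failure mode the finding warns about.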
In `@crates/btlightning-py/tests/test_multi_handler.py`:
- Around line 22-41: The test is flaky because it relies on time.sleep and only
cleans up at the end; make startup/teardown deterministic and always run cleanup
by wrapping the setup and assertions in a try/finally so client.close(),
server.stop() and thread.join() are always called, and replace the fixed
time.sleep with a deterministic sync (e.g., wait on a server-started event or a
new server.wait_until_ready() method) after server.start() before starting
requests; update references in this test to use server.start(), the
serve_forever thread, client.initialize_connections(), client.query_axon(), and
in the finally block call client.close(), server.stop() and t.join() to avoid
leaked resources.
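The lifecycle discipline prescribed above can be sketched with stubs. `StubServer` stands in for the real LightningServer (whose API is only assumed here), and the client calls are elided; what the sketch shows is the shape: event-based readiness instead of a sleep, and teardown in `finally`.

```python
import threading

class StubServer:
    """Stand-in for LightningServer; only the lifecycle shape is modeled."""
    def __init__(self):
        self.ready = threading.Event()   # set once serve_forever is running
        self._stop = threading.Event()
    def start(self):
        pass  # a real server would bind its socket here
    def serve_forever(self):
        self.ready.set()
        self._stop.wait()  # block until stop() is called
    def stop(self):
        self._stop.set()

def run_lifecycle() -> str:
    server = StubServer()
    server.start()
    t = threading.Thread(target=server.serve_forever, daemon=True)
    t.start()
    # Deterministic sync instead of time.sleep: wait for the ready event.
    assert server.ready.wait(timeout=2.0), "server never became ready"
    try:
        # client.initialize_connections(), client.query_axon() and the
        # test assertions would go here.
        outcome = "assertions ran"
    finally:
        # Always runs, even if an assertion above fails:
        server.stop()
        t.join(timeout=1.0)
    assert not t.is_alive(), "serve thread leaked"
    return outcome
```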
In `@crates/btlightning-py/tests/test_server.py`:
- Around line 35-48: Wrap the test setup and assertions for LightningServer in a
try/finally so the server teardown always runs: after creating server =
LightningServer(...), starting it (server.start()) and launching the
serve_forever thread (t = threading.Thread(...); t.start()), put the assertions
in the try block and call server.stop() in the finally block (and join the
thread t if needed) to ensure the server is stopped even if an assertion fails;
apply the same try/finally pattern to the other test block covering lines 51-63.
- Around line 39-42: The test uses a fixed time.sleep(0.05) after starting the
thread running server.serve_forever which is flaky; replace the sleep with an
explicit readiness check (either wait on a threading.Event the server sets when
ready or poll-connect to the server port) so the test only proceeds once the
server is actually accepting connections. Locate the thread start (t =
threading.Thread(target=server.serve_forever, daemon=True); t.start()) and
remove the time.sleep calls, then add a short loop that attempts to connect (or
waits on a server-ready Event exposed by the server) with a small timeout and
fail the test if readiness isn't achieved within a reasonable total timeout.
In `@crates/btlightning-py/tests/test_streaming.py`:
- Around line 28-38: Wrap the test logic that uses the Lightning instance and
server in a try/finally: after creating client = Lightning(...), calling
set_validator_keypair, initialize_connections and obtaining stream via
query_axon_stream, run the assertions inside the try block and move
client.close() and server.stop() into the finally block so they always run;
ensure any resources created by Lightning (client) and the test server are
cleaned up even if assertions fail.
- Around line 23-27: Replace the brittle fixed sleep after
server.start()/t.start() with a deterministic readiness poll: after calling
server.start() and starting the serve_forever thread, loop with a short sleep
(e.g. 5-50ms) up to a timeout and attempt to connect to the server (or check a
server-provided ready flag) until the connection succeeds, then proceed; replace
the time.sleep(0.05) with this polling loop (use server.serve_forever,
server.start, and the server's listen address/socket to probe) so the test waits
deterministically for the server to be ready instead of relying on a fixed
delay.
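Several of the prompts above converge on the same connect-poll pattern. A self-contained sketch of such a helper (the name, signature, and timeouts are illustrative, not part of the codebase):

```python
import socket
import time

def wait_until_listening(host: str, port: int, timeout: float = 2.0) -> None:
    """Poll-connect until host:port accepts TCP connections.

    Replaces a fixed time.sleep(0.05): returns as soon as the server is
    ready, and raises AssertionError if the deadline passes first.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=0.05):
                return  # connection accepted: server is listening
        except OSError:
            time.sleep(0.01)  # brief backoff before the next attempt
    raise AssertionError(f"{host}:{port} did not become ready within {timeout}s")
```

Tests and fixtures would call `wait_until_listening("127.0.0.1", free_port)` right after starting the `serve_forever` thread.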
ℹ️ Review info
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (12)
- .github/workflows/bench.yml
- .github/workflows/ci.yml
- .gitignore
- crates/btlightning-py/pyproject.toml
- crates/btlightning-py/tests/conftest.py
- crates/btlightning-py/tests/test_client.py
- crates/btlightning-py/tests/test_errors.py
- crates/btlightning-py/tests/test_multi_handler.py
- crates/btlightning-py/tests/test_roundtrip.py
- crates/btlightning-py/tests/test_server.py
- crates/btlightning-py/tests/test_streaming.py
- crates/btlightning-py/tests/test_types.py
? (pct > 15 ? ' !!!' : pct < -15 ? ' +++' : '')
: (pct < -15 ? ' !!!' : pct > 15 ? ' +++' : '');
Regression/improvement threshold appears inconsistent with the stated 10% tolerance.
The benchmark marker logic is still using 15%, so comment annotations won’t match the intended tolerance.
🛠️ Proposed fix
- ? (pct > 15 ? ' !!!' : pct < -15 ? ' +++' : '')
- : (pct < -15 ? ' !!!' : pct > 15 ? ' +++' : '');
+ ? (pct > 10 ? ' !!!' : pct < -10 ? ' +++' : '')
+ : (pct < -10 ? ' !!!' : pct > 10 ? ' +++' : '');

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

? (pct > 10 ? ' !!!' : pct < -10 ? ' +++' : '')
: (pct < -10 ? ' !!!' : pct > 10 ? ' +++' : '');
server.start()
t = threading.Thread(target=server.serve_forever, daemon=True)
t.start()
time.sleep(0.05)
yield server, free_port
server.stop()
Harden fixture lifecycle to reduce startup flakiness and thread leakage.
A fixed delay is brittle, and teardown should wait for the serve thread to exit after server.stop().
♻️ Proposed fixture hardening
def echo_server(free_port):
@@
    server.register_synapse_handler("echo", lambda data: data)
    server.start()
    t = threading.Thread(target=server.serve_forever, daemon=True)
    t.start()
-   time.sleep(0.05)
+   deadline = time.time() + 2.0
+   while time.time() < deadline:
+       try:
+           with socket.create_connection(("127.0.0.1", free_port), timeout=0.1):
+               break
+       except OSError:
+           time.sleep(0.01)
+   else:
+       raise AssertionError("echo server did not become ready in time")
    yield server, free_port
    server.stop()
+   t.join(timeout=1.0)
t = threading.Thread(target=server.serve_forever, daemon=True)
t.start()
time.sleep(0.05)
Avoid fixed sleeps for server readiness in lifecycle tests.
Line 41 and Line 57 rely on time.sleep(0.05), which is timing-sensitive and can intermittently fail in busy CI environments.
Suggested fix pattern
- time.sleep(0.05)
+ deadline = time.time() + 2.0
+ while time.time() < deadline:
+     try:
+         with socket.create_connection(("127.0.0.1", port), timeout=0.05):
+             break
+     except OSError:
+         time.sleep(0.01)
+ else:
+     raise AssertionError("Server did not become ready in time")

Also applies to: 55-58
server.start()
t = threading.Thread(target=server.serve_forever, daemon=True)
t.start()
time.sleep(0.05)
Replace fixed startup sleep with deterministic readiness polling.
Line 26 uses a fixed time.sleep(0.05), which is flaky on shared CI runners and can produce intermittent failures.
Suggested fix
server.start()
t = threading.Thread(target=server.serve_forever, daemon=True)
t.start()
- time.sleep(0.05)
+ deadline = time.time() + 2.0
+ while time.time() < deadline:
+     try:
+         with socket.create_connection(("127.0.0.1", port), timeout=0.05):
+             break
+     except OSError:
+         time.sleep(0.01)
+ else:
+     raise AssertionError("Server did not become ready in time")
Actionable comments posted: 1
♻️ Duplicate comments (2)
crates/btlightning-py/tests/test_streaming.py (1)
26-26: ⚠️ Potential issue | 🟠 Major
Use readiness polling instead of a fixed sleep before streaming assertions.
Line 26 uses a fixed delay that can intermittently fail under CI load.
Suggested fix

- time.sleep(0.05)
+ deadline = time.time() + 2.0
+ while time.time() < deadline:
+     try:
+         with socket.create_connection(("127.0.0.1", port), timeout=0.05):
+             break
+     except OSError:
+         time.sleep(0.01)
+ else:
+     raise AssertionError("Server did not become ready in time")

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/btlightning-py/tests/test_streaming.py` at line 26, Replace the brittle fixed delay time.sleep(0.05) in tests/test_streaming.py with a readiness-polling loop: identify the readiness condition used by the test (e.g., a stream.ready() / connection.is_ready() method, presence of messages in the stream/queue, or a helper like wait_for_event), poll that condition with a short sleep interval until a timeout (e.g., 1–5s) and only then perform the streaming assertions; update the test to fail with a clear timeout message if readiness is not observed. Ensure you replace the exact time.sleep(0.05) call so the test waits deterministically for the stream to be ready rather than relying on a fixed delay.

crates/btlightning-py/tests/test_server.py (1)
41-41: ⚠️ Potential issue | 🟠 Major
Replace fixed startup sleeps with deterministic readiness polling.
Line 41 and Line 59 still rely on a fixed time.sleep(0.05), which is timing-sensitive and can flake on slower CI runners.
Suggested fix

+ def _wait_until_listening(host: str, port: int, timeout_secs: float = 2.0) -> None:
+     deadline = time.time() + timeout_secs
+     while time.time() < deadline:
+         try:
+             with socket.create_connection((host, port), timeout=0.05):
+                 return
+         except OSError:
+             time.sleep(0.01)
+     raise AssertionError(f"Server {host}:{port} did not become ready in time")
+
@@
- time.sleep(0.05)
+ _wait_until_listening("127.0.0.1", port)
@@
- time.sleep(0.05)
+ _wait_until_listening("127.0.0.1", port)

Also applies to: 59-59
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/btlightning-py/tests/test_server.py` at line 41, The test uses fixed time.sleep(0.05) in tests/test_server.py which is flaky; replace both occurrences with deterministic readiness polling: remove time.sleep(0.05) and implement a short loop that repeatedly attempts to contact the test server (e.g., TCP connect to the server port or an HTTP/health endpoint) with a small backoff and overall timeout (e.g., 2–5s), returning as soon as the connection succeeds, and failing the test if the timeout elapses; update the two places where time.sleep is called to use this polling helper (e.g., wait_for_server_ready or inline loop) so startup is robust on slow CI.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@crates/btlightning-py/tests/test_errors.py`:
- Around line 32-37: The test test_query_without_signer should guarantee
Lightning cleanup: wrap the client creation and the pytest.raises block in a
try/finally so that client.close() is always called even if client.query_axon
doesn't raise; specifically, create the Lightning instance (client =
Lightning(...)), run the with pytest.raises(ConnectionError, match="endpoint not
initialized"): client.query_axon(...) inside try, and call client.close() in
finally to ensure cleanup rather than relying on __del__.
ℹ️ Review info
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
- crates/btlightning-py/tests/conftest.py
- crates/btlightning-py/tests/test_errors.py
- crates/btlightning-py/tests/test_multi_handler.py
- crates/btlightning-py/tests/test_server.py
- crates/btlightning-py/tests/test_streaming.py
🚧 Files skipped from review as they are similar to previous changes (2)
- crates/btlightning-py/tests/conftest.py
- crates/btlightning-py/tests/test_multi_handler.py
Summary
- Benchmark CI workflow (`bench`) that builds `lightning-bench` for both the PR and main branches, runs them back-to-back on the same runner, and posts a markdown comparison comment with percentage changes on the PR
- Pytest suite covering the `btlightning` Python API surface (client, server, roundtrip, streaming, type serialization, error paths, multi-handler routing), with a corresponding CI job using `maturin develop --release`
- Scopes `.gitignore` test-file exclusions to root/benchmarks directories instead of globally ignoring `test_*.py`

Test plan

- `bench.yml` YAML parses correctly and the job skips on PRs without the `bench` label
- Add the `bench` label to a PR and confirm the benchmark comparison comment is posted
- `python-test` CI job passes: `maturin develop --release && pytest tests/ -v --timeout=30`
- Run locally: `cd crates/btlightning-py && maturin develop --release && pytest tests/ -v --timeout=30`

Summary by CodeRabbit
Tests
Chores