|
| 1 | +# Case Study: Benchmark Failure Analysis - Issue #3 |
| 2 | + |
| 3 | +## Executive Summary |
| 4 | + |
| 5 | +The Neo4j vs Doublets benchmark suite was failing during the delete benchmark with error `NotExists(4000)`. Root cause analysis revealed that the `Transaction` wrapper implementation was a non-functional stub that always returned errors for delete operations, causing the benchmark to panic. |
| 6 | + |
| 7 | +## Timeline of Events |
| 8 | + |
| 9 | +| Time | Event | |
| 10 | +|------|-------| |
| 11 | +| 2025-12-22 12:22:51 | First failed run detected (run-20431789056) | |
| 12 | +| 2025-12-22 13:20:15 | Second failed run (run-20433102564) | |
| 13 | +| 2025-12-22 16:06:33 | Third failed run referenced in issue (run-20437258944) | |
| 14 | + |
| 15 | +All three failures exhibited the same error pattern: |
| 16 | +``` |
| 17 | +thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: NotExists(4000)', benches/benchmarks/delete.rs:23:9 |
| 18 | +``` |
| 19 | + |
| 20 | +## Technical Analysis |
| 21 | + |
| 22 | +### Error Location |
| 23 | + |
| 24 | +The panic occurred at `benches/benchmarks/delete.rs:23`, which corresponds to the `bench!` macro invocation. The actual error was propagated from inside the macro where `fork.delete(id)?` was called. |
| 25 | + |
| 26 | +### Root Cause |
| 27 | + |
| 28 | +The root cause was in `rust/src/transaction.rs`. The `Transaction` struct was intended to be a wrapper around the `Client` to provide transactional Neo4j operations. However, the implementation was a non-functional stub: |
| 29 | + |
| 30 | +**Original problematic code:** |
| 31 | +```rust |
| 32 | +// transaction.rs lines 55-74 |
| 33 | +fn create_links(&mut self, _query: &[T], handler: WriteHandler<T>) -> std::result::Result<Flow, Error<T>> { |
| 34 | + // Does nothing - just returns empty handler |
| 35 | + Ok(handler(Link::nothing(), Link::nothing())) |
| 36 | +} |
| 37 | + |
| 38 | +fn delete_links(&mut self, query: &[T], _handler: WriteHandler<T>) -> std::result::Result<Flow, Error<T>> { |
| 39 | + // Always returns NotExists error! |
| 40 | + Err(Error::NotExists(query[0])) |
| 41 | +} |
| 42 | +``` |
| 43 | + |
| 44 | +### Execution Flow |
| 45 | + |
| 46 | +1. **Create benchmark** runs first for both `Neo4j_NonTransaction` and `Neo4j_Transaction` |
| 47 | + - `Neo4j_NonTransaction`: Uses `Client` which properly creates links in Neo4j |
| 48 | + - `Neo4j_Transaction`: Uses `Transaction` which silently does nothing (returns empty handler) |
| 49 | + - Create benchmark "passes" for both because no error is returned |
| 50 | + |
| 51 | +2. **Delete benchmark** runs second |
| 52 | + - `Neo4j_NonTransaction`: Completes successfully (links exist, deletes work) |
| 53 | + - `Neo4j_Transaction`: Fails immediately because: |
| 54 | + - `create_links` didn't actually create any links |
| 55 | + - `delete_links` always returns `Err(Error::NotExists(id))` |
| 56 | + |
| 57 | +3. The `tri!` macro wraps the benchmark and calls `.unwrap()` on the result, causing the panic |
| 58 | + |
| 59 | +### Why Error was `NotExists(4000)`? |
| 60 | + |
| 61 | +The delete benchmark loops from ID 4000 down to 3000 in reverse order: |
| 62 | +```rust |
| 63 | +for id in (BACKGROUND_LINKS..=BACKGROUND_LINKS + 1_000).rev() { |
| 64 | + let _ = elapsed! {fork.delete(id)?}; |
| 65 | +} |
| 66 | +``` |
| 67 | + |
| 68 | +Where `BACKGROUND_LINKS = 3000`. So the first attempted delete is ID 4000, which immediately returns `Err(NotExists(4000))`. |
| 69 | + |
| 70 | +## Solution Implemented |
| 71 | + |
| 72 | +The fix properly implements `Transaction` to delegate all operations to the underlying `Client`: |
| 73 | + |
| 74 | +1. **Made Client API public**: Added public accessor methods and made response types public |
| 75 | + - `pub fn host(&self) -> &str` |
| 76 | + - `pub fn port(&self) -> u16` |
| 77 | + - `pub fn auth(&self) -> &str` |
| 78 | + - `pub fn constants(&self) -> &LinksConstants<T>` |
| 79 | + - `pub fn fetch_next_id(&self) -> i64` |
| 80 | + - `pub fn execute_cypher(&self, ...) -> Result<CypherResponse>` |
| 81 | + - Made `CypherResponse`, `QueryResult`, `RowData`, `CypherError` public |
| 82 | + |
| 83 | +2. **Rewrote Transaction**: Full delegation to Client for all Links/Doublets operations |
| 84 | + - `create_links`: Now properly creates links via Client's execute_cypher |
| 85 | + - `delete_links`: Now properly queries and deletes via Client |
| 86 | + - All other operations also properly delegated |
| 87 | + |
| 88 | +### Design Note |
| 89 | + |
| 90 | +As the comment in `transaction.rs` explains: |
| 91 | +> In the HTTP-based approach using `/db/neo4j/tx/commit` endpoint, all requests are auto-committed transactions. This wrapper exists for API compatibility to benchmark "transactional" Neo4j operations. |
| 92 | +
|
| 93 | +The Transaction and NonTransaction benchmarks will now produce similar results since the HTTP API auto-commits. For true transactional semantics, multi-request transaction endpoints would need to be used (see [Neo4j HTTP Transaction API](https://neo4j.com/docs/http-api/current/transactions/)). |
| 94 | + |
| 95 | +## Files Changed |
| 96 | + |
| 97 | +| File | Changes | |
| 98 | +|------|---------| |
| 99 | +| `rust/src/client.rs` | Made CypherResponse/QueryResult/RowData/CypherError public, added accessor methods | |
| 100 | +| `rust/src/transaction.rs` | Complete rewrite to delegate to Client | |
| 101 | + |
| 102 | +## Lessons Learned |
| 103 | + |
| 104 | +1. **Stub implementations should fail explicitly**: The original stub silently returned success for creates but explicit failure for deletes, causing confusing behavior. |
| 105 | + |
| 106 | +2. **Integration tests needed**: Unit tests for individual components would not have caught this since the stub "worked" in isolation. |
| 107 | + |
| 108 | +3. **CI logs are essential**: The CI logs clearly showed the exact error and location, enabling quick diagnosis. |
| 109 | + |
| 110 | +## References |
| 111 | + |
| 112 | +- [Issue #3](https://github.com/linksplatform/Comparisons.Neo4jVSDoublets/issues/3) |
| 113 | +- [PR #4](https://github.com/linksplatform/Comparisons.Neo4jVSDoublets/pull/4) |
| 114 | +- [Neo4j HTTP Transaction API](https://neo4j.com/docs/http-api/current/transactions/) |
| 115 | +- [Neo4j Cypher Transaction API](https://neo4j.com/docs/http-api/current/actions/) |
| 116 | + |
| 117 | +## CI Logs Archive |
| 118 | + |
| 119 | +The following CI run logs have been archived in this case study: |
| 120 | +- `logs/run-20431789056.log` - First failure |
| 121 | +- `logs/run-20433102564.log` - Second failure |
| 122 | +- `logs/run-20437258944.log` - Third failure (referenced in issue) |
0 commit comments