Skip to content

Conversation

@amesgen
Copy link
Member

@amesgen amesgen commented Aug 21, 2024

Follow-up to #1168 that makes sure that adding a tx exceeding the per-tx limit does not cause a deadlock which prevents txs from being added to the mempool until the node is restarted.

We accomplish this by validating such transactions and relying on the per-tx limit to reject them.

@amesgen amesgen force-pushed the amesgen/mempool-large-tx branch from 5061ad5 to 501d885 Compare August 21, 2024 15:05
Comment on lines 191 to 202
maxTotalRefScriptSize = 1024 * 1024 -- 1MiB
, curTotalRefScriptSize + newTxRefScriptSize Prelude.<= maxTotalRefScriptSize
mempoolStaysBelowCapacity =
curTotalRefScriptSize + newTxRefScriptSize Prelude.<= maxTotalRefScriptSize
-- In case the tx exceeds the per-tx limit, let it be rejected by tx
-- validation (such that we are not blocked here forever/for a long
-- time).
--
-- For Babbage, this is 100KiB (see @totalRefScriptsSizeLimit@ in
-- "Ouroboros.Consensus.Shelley.Eras"), and for Conway, this is 200KiB
-- (see @maxRefScriptSizePerTx@ in "Cardano.Ledger.Conway.Rules.Ledger").
txRefScriptSizeTooLarge = newTxRefScriptSize Prelude.> 200 * 1024
, mempoolStaysBelowCapacity || txRefScriptSizeTooLarge
Copy link
Member Author

@amesgen amesgen Aug 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Subtlety: It would also be enough to prevent a hard deadlock by replacing 200 * 1024 with maxTotalRefScriptSize. However, this is still problematic in that honest nodes can be forced to forge very small blocks:

  • Suppose someone first submits a tx A using a very small (eg just one byte) referene script, and then one tx B that includes 1MiB of reference scripts.
  • The tx A would enter the mempool, then potentially a few other txs from other peers, and then adding tx B would block, causing no new transactions to be added, until...
  • ...an SPO mints a block, all txs in the mempool are removed, and the tx B can now be rejected. The problem is that the resulting block only contains very few txs.

The approach taken here makes sure that this can only happen when the mempool already contains 1024 - 200 = 824 KiB of ref scripts, which is already larger than basically all "normal" blocks using reference scripts. This is also similar to how #1175 handles this case.

Copy link
Contributor

@nfrisby nfrisby left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved. Only one suggestion for a small addition in the comment, regarding overflow.

@amesgen amesgen added this pull request to the merge queue Aug 23, 2024
@amesgen amesgen removed this pull request from the merge queue due to a manual request Aug 23, 2024
@amesgen amesgen force-pushed the amesgen/mempool-large-tx branch from 501d885 to 7a2a047 Compare August 23, 2024 12:50
@amesgen amesgen enabled auto-merge August 23, 2024 12:51
@amesgen amesgen added this pull request to the merge queue Aug 23, 2024
Merged via the queue into main with commit 9351e28 Aug 23, 2024
@amesgen amesgen deleted the amesgen/mempool-large-tx branch August 23, 2024 13:38
github-merge-queue bot pushed a commit that referenced this pull request Sep 24, 2024
Closes #1226

In addition to the previously valid/invalid txs (purely based on the
UTxO ledger rules), we add an optional per-tx limit to the mock block.

As a second step, we generate very large txs that are larger than an
entire mempool, in order to test that we do *not* block when adding them
(just like the other txs), which is important as explained in #1226.

---

One way to validate this PR is to introduce a bug that would cause us to
block on such transactions, and observe that the test now indeed catches
that.

For example, retrying when the per-tx limited is not satisfied (this is
basically what happened in #1168 and was fixed by #1225) via
```diff
diff --git a/ouroboros-consensus/src/ouroboros-consensus/Ouroboros/Consensus/Mempool/Update.hs b/ouroboros-consensus/src/ouroboros-consensus/Ouroboros/Consensus/Mempool/Update.hs
index 372ea15..31abe25fbf 100644
--- a/ouroboros-consensus/src/ouroboros-consensus/Ouroboros/Consensus/Mempool/Update.hs
+++ b/ouroboros-consensus/src/ouroboros-consensus/Ouroboros/Consensus/Mempool/Update.hs
@@ -170,25 +170,8 @@ pureTryAddTx ::
   -> TryAddTx blk
 pureTryAddTx cfg wti tx is =
   case runExcept $ txMeasure cfg (isLedgerState is) tx of
-    Left err ->
-      -- The transaction does not have a valid measure (eg its ExUnits is
-      -- greater than what this ledger state allows for a single transaction).
-      --
-      -- It might seem simpler to remove the failure case from 'txMeasure' and
-      -- simply fully validate the tx before determining whether it'd fit in
-      -- the mempool; that way we could reject invalid txs ASAP. However, for a
-      -- valid tx, we'd pay that validation cost every time the node's
-      -- selection changed, even if the tx wouldn't fit. So it'd very much be
-      -- as if the mempool were effectively over capacity! What's worse, each
-      -- attempt would not be using 'extendVRPrevApplied'.
-      TryAddTx
-        Nothing
-        (MempoolTxRejected tx err)
-        (TraceMempoolRejectedTx
-         tx
-         err
-         (isMempoolSize is)
-        )
+    Left _err ->
+      NotEnoughSpaceLeft
     Right txsz
       -- Check for overflow
       --
```
or here
```diff
diff --git a/ouroboros-consensus/src/unstable-mock-block/Ouroboros/Consensus/Mock/Ledger/Block.hs b/ouroboros-consensus/src/unstable-mock-block/Ouroboros/Consensus/Mock/Ledger/Block.hs
index 743e11b..e01d8cfe5a 100644
--- a/ouroboros-consensus/src/unstable-mock-block/Ouroboros/Consensus/Mock/Ledger/Block.hs
+++ b/ouroboros-consensus/src/unstable-mock-block/Ouroboros/Consensus/Mock/Ledger/Block.hs
@@ -452,10 +452,7 @@ instance TxLimits (SimpleBlock c ext) where
   -- But not 'maxbound'!, since the mempool sometimes holds multiple blocks worth.
   blockCapacityTxMeasure _cfg _st = IgnoringOverflow simpleBlockCapacity

-  txMeasure cfg _st =
-        fmap IgnoringOverflow
-      . checkTxSize (simpleLedgerMockConfig cfg)
-      . simpleGenTx
+  txMeasure _cfg _st = pure . IgnoringOverflow . txSize . simpleGenTx

 simpleBlockCapacity :: ByteSize32
 simpleBlockCapacity = ByteSize32 512
```
will cause many test failures with `FailureDeadlock [Labelled (ThreadId
[]) (Just "main")]'` via `io-sim`'s nice deadlock detection.

---

Stacked on top of #1175 to avoid boring rebase work
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants