
Conversation

@Unisay Unisay (Contributor) commented Sep 18, 2025

Summary

This PR implements cost modeling for Value-related builtins: lookupCoin, valueContains, valueData, and unValueData.

Implementation

Complete cost modeling pipeline:

  • Cost model infrastructure and parameter definitions
  • Benchmarking framework with realistic Cardano constraints
  • Statistical analysis with R models (linear/constant based on performance characteristics)
  • Updated JSON cost model configurations across all versions

Cost models:

  • valueData: Uses constant cost model based on uniform performance analysis
  • lookupCoin: Linear cost model with dimension reduction for 3+ parameters
  • valueContains: Linear cost model for container/contained size dependency
  • unValueData: Linear cost model for size-dependent deserialization

All functions now have proper cost models instead of unimplemented placeholders.
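
For intuition, each fitted model boils down to either a constant or a straight line in a single (possibly derived) size measure. A minimal Haskell sketch of that shape, with illustrative names only (not the actual plutus-core costing types or the fitted parameter values):

-- Illustrative only: the shape of the fitted models, not the plutus-core types.
data CostModelShape
  = ConstantCost Integer        -- e.g. valueData: same cost for every input
  | LinearCost Integer Integer  -- intercept and slope in one size measure

runModel :: CostModelShape -> Integer -> Integer
runModel (ConstantCost c)             _    = c
runModel (LinearCost intercept slope) size = intercept + slope * size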

Recent Updates

ValueContains Optimization (commits d785dad, 149c182):

  • Replaced manual iteration with Map.isSubmapOfBy for 4.5x performance improvement
  • Benchmark results show slope reduction: 6548 → 1470 (4.5x faster per-operation)
  • Updated cost model parameters based on GitHub Actions benchmarking run

Benchmark Generation Optimization (commit cb6689b):

  • Optimized worst-case Value generation with 99.7% size reduction
  • Minimized off-path map sizes while maintaining worst-case lookup guarantees
  • Significantly faster benchmark execution without changing measurement accuracy

Visualization

Interactive cost model visualizations available at:
https://plutus.cardano.intersectmbo.org/cost-models/

To preview this PR's cost models, configure the data source to load from this branch:

  1. Open the visualization page for the function (e.g., /cost-models/valuecontains/)
  2. Update the data source URLs to point to this branch's raw files:
    • Benchmark data: https://raw.githubusercontent.com/IntersectMBO/plutus/yura/costing-builtin-value/plutus-core/cost-model/data/benching-conway.csv
    • Cost model: https://raw.githubusercontent.com/IntersectMBO/plutus/yura/costing-builtin-value/plutus-core/cost-model/data/builtinCostModelC.json
  3. The visualization will render this PR's updated cost model parameters

Available visualizations: lookupCoin, valueContains, valueData, unValueData

@Unisay Unisay self-assigned this Sep 18, 2025
github-actions bot (Contributor) commented Sep 18, 2025

PR Preview Action v1.6.2

🚀 View preview at
https://IntersectMBO.github.io/plutus/pr-preview/pr-7344/

Built to branch gh-pages at 2025-09-19 08:01 UTC.
Preview will be ready when the GitHub Pages deployment is complete.

@Unisay Unisay force-pushed the yura/costing-builtin-value branch 6 times, most recently from 528ebcd to 69f1d6f on September 24, 2025 16:06
@Unisay Unisay changed the title WIP: Add costing for lookupCoin and valueContains builtins Cost models for LookupCoin, ValueContains, ValueData, UnValueData builtins Sep 24, 2025
@Unisay Unisay marked this pull request as ready for review September 24, 2025 16:24
@Unisay Unisay requested review from ana-pantilie and kwxm September 24, 2025 16:41
@Unisay Unisay force-pushed the yura/costing-builtin-value branch 3 times, most recently from 53d9ea1 to 5b60cfc on September 30, 2025 10:15
@Unisay Unisay force-pushed the yura/costing-builtin-value branch from 5b60cfc to 7eebe28 on October 2, 2025 09:43
@kwxm kwxm (Contributor) left a comment

Here are some initial comments. I'll come back and add some more later. I need to look at the benchmarks properly though.

@Unisay Unisay force-pushed the yura/costing-builtin-value branch from b1a6bf1 to 6afef50 on October 9, 2025 14:11
@Unisay Unisay requested a review from zliu41 October 9, 2025 14:20
@Unisay Unisay force-pushed the yura/costing-builtin-value branch from 3cee663 to 86d645a on October 10, 2025 10:26
@zliu41 zliu41 (Member) left a comment

In order to benchmark the worst case, I think you should also ensure that lookupCoin always hits the largest inner map (or at least, such cases should be well-represented).

Also, we'll need to re-run benchmarking for unValueData after adding the enforcement of integer range.

@@ -12094,203 +12094,710 @@ IndexArray/42/1,1.075506579052359e-6,1.0748433439930302e-6,1.0762684407023462e-6
IndexArray/46/1,1.0697135554442532e-6,1.0690902192698813e-6,1.0704133377013816e-6,2.2124820728450233e-9,1.8581237858977844e-9,2.6526943923047553e-9
IndexArray/98/1,1.0700747499373992e-6,1.0693842628239684e-6,1.070727062396803e-6,2.2506114869928674e-9,1.9376849028666025e-9,2.7564941558204088e-9
IndexArray/82/1,1.0755056682976695e-6,1.0750405368241111e-6,1.076102212770973e-6,1.8355219893844098e-9,1.5161640335164335e-9,2.4443625958006994e-9
Bls12_381_G1_multiScalarMul/1/1,8.232134704712041e-5,8.228195390475752e-5,8.23582682466318e-5,1.224261187989977e-7,9.011720721178711e-8,1.843107342917502e-7
@kwxm kwxm (Contributor) commented Oct 10, 2025

GitHub seems to think that the data for all of the BLS functions has changed, but I don't think it has.

@Unisay Unisay (Contributor, Author) commented Oct 13, 2025

The file on master contains Windows-style line terminators (\r\n) for BLS lines:

git show master:plutus-core/cost-model/data/benching-conway.csv | grep "Bls12_381_G1_multiScalarMul/1/1" | od -c | grep -C1 "\r"
0000000   B   l   s   1   2   _   3   8   1   _   G   1   _   m   u   l
0000020   t   i   S   c   a   l   a   r   M   u   l   /   1   /   1   ,
0000040   8   .   2   3   2   1   3   4   7   0   4   7   1   2   0   4
--
0000200   8   7   1   1   e   -   8   ,   1   .   8   4   3   1   0   7
0000220   3   4   2   9   1   7   5   0   2   e   -   7  \r  \n

This PR changes \r\n to \n.

@Unisay Unisay force-pushed the yura/costing-builtin-value branch from 680af99 to bff4bf2 on October 13, 2025 10:42
@ana-pantilie ana-pantilie (Contributor) left a comment

I'm only just understanding the philosophy behind costing, so I might be wrong. I think it's important we both generate random inputs and inputs which hit the various edge-cases the algorithm has. The benchmarking data we generate should be a sample which describes the algorithm's behavior as completely as possible.

@kwxm kwxm (Contributor) commented Oct 14, 2025

I think it's important we both generate random inputs and inputs which hit the various edge-cases the algorithm has. The benchmarking data we generate should be a sample which describes the algorithm's behavior as completely as possible.

In general we want the costing function to describe the worst-case behaviour of the builtin, which means that the benchmarks should be run with inputs which produce worst-case behaviour. If we feed them totally random inputs then we'll fit a costing function which describes the average-case behaviour, and this could be dangerous. For example, if a function takes 1ms on average but there are particular inputs which cause it to take 20ms, then the cost model has to charge 20ms for all inputs so that we don't undercharge for scripts that do actually exercise the worst case (perhaps repeatedly). [An example of this kind of behaviour would be equalsByteString, where we only benchmark the case where the two inputs are equal: if they're not equal (in particular if they have different lengths) then the function can return very quickly, but if they are equal the builtin will have to examine every byte of the input.] For some builtins the worst case won't be significantly worse than the average case, and then we might be able to get away with random inputs, especially if it's hard to generate worst-case inputs.

In the case of lookupCoin I'm not sure what the worst case is. It may well be when the thing you're looking up isn't present in the map, but we should check whether this is actually true.

Addendum. It's often necessary to do a lot of exploratory benchmarking to check that the builtin behaves as you're expecting, and this isn't generally reflected in the final costing benchmarks and costing function (for example, see the costing branches for expModInteger here and here, where I ran dozens of different benchmarks with different inputs before settling on the final version). So we're not just choosing a costing function and blindly fitting benchmark results to it: we're checking that our initial assumptions are correct and modifying them if necessary. It might be useful to do this with the lookup/insertion/union costs as well.

@zliu41 zliu41 (Member) commented Oct 22, 2025

In the case of lookupCoin I'm not sure what the worst case is. It may well be when the thing you're looking up isn't present in the map, but we should check whether this is actually true.

As I said earlier, for lookupCoin the worst case would be: the currency hits the largest inner map, and the token does not exist in that inner map, because this involves lookups in both the outer map and the inner map. @Unisay Let's make sure that the case where the currency hits the largest inner map is well represented in the data.

@kwxm kwxm (Contributor) commented Oct 22, 2025

In the case of lookupCoin I'm not sure what the worst case is. It may well be when the thing you're looking up isn't present in the map, but we should check whether this is actually true.

Like what I said earlier, for lookupCoin the worst case would be: the currency hits the largest inner map, and the token does not exist in the inner map. Because this involves lookups in both the outer map and inner map. @Unisay Let's make sure that the case where the currency hits the largest inner map is well represented in the data.

What about the case when the outer map is really big compared to the inner ones? Also, we could make all of the inner maps the same size. We probably don't want to be benchmarking anything that isn't the worst case.

@zliu41 zliu41 (Member) commented Oct 22, 2025

What about the case when the outer map is really big compared to the inner ones?

Isn't the worst case in that case still performing a lookup in both the outer map and the inner map? If you are concerned that the outer map key is not at a leaf, you can use either the smallest key or the largest key.

By the way, to make key comparisons more expensive, it would also be useful to fix the first 30 or 31 bytes and only vary the last byte.

@Unisay Unisay force-pushed the yura/costing-builtin-value branch from a8326df to 65eeca2 on November 5, 2025 11:43
where
-- Maximum budget for Value generation (30,000 bytes)
maxValueInBytes :: Int
maxValueInBytes = 30_000
A Member commented:

Why is 30000 bytes the max? A Value can be created programmatically, in which case it is not limited by bytes, but by execution units.

This limit seems way too small. We should benchmark Values with at least 10k entries, if not 100k or more.

let prefixLen = Value.maxKeyLen - 4
prefix = BS.replicate prefixLen (0xFF :: Word8)
-- Encode the integer in big-endian format (last 4 bytes)
b0 = fromIntegral $ (n `shiftR` 24) .&. 0xFF
A Member commented:

you can simply generate 4 random bytes, instead of generating an integer and doing these bitwise operations.
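
A minimal sketch of that suggestion, threading System.Random's StdGen explicitly as the benchmark generators do; the helper name is hypothetical, not the PR's actual code:

import qualified Data.ByteString as BS
import           Data.Word (Word8)
import           System.Random (StdGen, random)

-- Hypothetical helper: a key made of a fixed prefix plus 4 uniformly random
-- trailing bytes, with no manual bit twiddling.
randomSuffixKey :: BS.ByteString -> StdGen -> (BS.ByteString, StdGen)
randomSuffixKey prefix gen0 =
  let (b1, g1) = random gen0 :: (Word8, StdGen)
      (b2, g2) = random g1
      (b3, g3) = random g2
      (b4, g4) = random g3
  in (prefix <> BS.pack [b1, b2, b3, b4], g4)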

[] -- no type arguments needed (monomorphic builtin)
(lookupCoinArgs gen) -- the argument combos to generate benchmarks for

lookupCoinArgs :: StdGen -> [(ByteString, ByteString, Value)]
@kwxm kwxm (Contributor) commented Nov 9, 2025

Maps are implemented as balanced binary trees with (key, value) pairs at the internal nodes, so if you've got a node containing key k then the left subtree of the node will only contain keys less than k and the right subtree keys greater than k. The functions that operate on maps are supposed to keep the tree balanced, so the left subtree should be (approximately) the same size as the right one. When you've got 2^n - 1 nodes the tree should be perfectly balanced and every path from the root to a leaf should have length n (ie, you should pass through n nodes as you travel from the root to an entry both of whose subtrees are empty (Tip)).

I think that to get the worst case behaviour for lookupCoin you can generate an outer map with 2^a - 1 unique keys for a in some range like 1..10 or 1..15, so you get a full tree. Looking for the entry with the largest key should then require searching all the way to the bottom of the tree (always branching to the right), which should be the worst case (and also make sure that all of the keys have a long common prefix to maximise the comparison time). The inner map for this longest case should also be a full binary tree, with 2^b - 1 entries for some b; we want the worst case to search the inner map as well, and in this case that should happen when you look for a key that's bigger than all of the keys in the inner map, since again you'll have to search all the way down the right hand side of the tree; I think that if you search for the biggest key it'll take pretty much the same time though. The total time taken should be proportional to a+b, since you'll have to examine a nodes in the outer map and b in the inner map, and since the keys are of the same type for inner and outer maps the time taken per node should be about the same for both (if the key types in the inner and outer maps were different then you might be looking at something of the form ra+sb, but here r and s should be the same, so you've got r(a+b)).

Note that a and b are the depths of the trees, which will be integerLog2 of the number of entries, so I think here you want to use a size measure which is integerLog2(outer size) + integerLog2(maximum inner size).
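
A minimal sketch of this construction (hypothetical names, not the PR's benchmark generator): an outer map with 2^a - 1 keys, full inner maps with 2^b - 1 keys, all keys sharing a long prefix, and a lookup of the largest outer and inner keys so the search walks a whole root-to-leaf path in both trees.

import qualified Data.ByteString as BS
import qualified Data.Map.Strict as Map

-- Keys are 32 bytes: a 28-byte shared 0xFF prefix plus a 4-byte big-endian
-- index, so comparisons scan the common prefix and key i < key (i+1).
mkKey :: Int -> BS.ByteString
mkKey n = BS.replicate 28 0xFF
  <> BS.pack [ fromIntegral (n `div` 256 ^ i `mod` 256) | i <- [3, 2, 1, 0 :: Int] ]

-- Outer map with 2^a - 1 keys; every policy carries a full inner map with
-- 2^b - 1 keys (sharing one inner map value keeps the sketch cheap to build).
fullTreeValue :: Int -> Int -> Map.Map BS.ByteString (Map.Map BS.ByteString Integer)
fullTreeValue a b =
  let inner = Map.fromList [ (mkKey t, 1) | t <- [1 .. 2 ^ b - 1] ]
  in  Map.fromList [ (mkKey p, inner) | p <- [1 .. 2 ^ a - 1] ]

-- Worst-case style query: the largest outer key, then the largest inner key
-- (looking up a key greater than all inner keys behaves much the same).
worstCaseLookup :: Int -> Int -> Maybe Integer
worstCaseLookup a b =
  Map.lookup (mkKey (2 ^ b - 1)) =<< Map.lookup (mkKey (2 ^ a - 1)) (fullTreeValue a b)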

A Contributor commented:

I guess you'll need to benchmark this over some set of pairs of depths (a,b) where a and b both vary over [1..15] or something, but that might take a long time since for that range you'd be running 225 benchmarks. I think that in fact the time will only depend on a+b, but initially we should check that that's true by looking at different values of a and b with the same sum.

A Member commented:

It might be overkill to bother generating full trees. In any balanced binary tree the depths of any two leaves differ by at most 1, so as long as we make sure that we hit a leaf node (using either the smallest key or the largest key), 🤷

It's more important to make sure the outer key hits the largest inner map. I think the worst case is: there's a large inner map with N/2 keys, together with N/2 singleton inner maps, and the outer key hits that large inner map. Whether the outer map and the inner map are full trees shouldn't matter that much.

A Member commented:

So in summary, I would do this:

  • Given total size N, let there be roughly N/2 inner maps: one big inner map whose size is roughly N/2, and the rest are singletons.
  • Outer key should hit the big inner map.
  • Both the outer key and the inner key should hit leaf nodes. So both keys should be either the min or the max key in the respective map (or the inner key may be absent in the inner map).

This should be very close to the worst case, if not the worst case. I wouldn't bother with varying a and b, or generating full trees, or anything like that.
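
A minimal sketch of this recipe (hypothetical names, not the PR's generator): one big inner map with roughly n/2 tokens, the remaining policies as singletons, and query keys that hit the big inner map at a leaf with an absent token key.

import qualified Data.ByteString as BS
import qualified Data.Map.Strict as Map

-- Given a total entry budget n, return the Value-like nested map together
-- with the worst-case (policy, token) query: the maximum outer key owns the
-- big inner map, and the token key is greater than every key in it.
worstCaseValueAndQuery
  :: Int
  -> ( Map.Map BS.ByteString (Map.Map BS.ByteString Integer)
     , (BS.ByteString, BS.ByteString) )
worstCaseValueAndQuery n = (value, (key half, key (half + 1)))
  where
    half  = n `div` 2
    -- 28-byte shared prefix plus a 4-byte big-endian index: key i is
    -- monotonically increasing in i, and comparisons scan the prefix.
    key i = BS.replicate 28 0xFF
      <> BS.pack [ fromIntegral (i `div` 256 ^ j `mod` 256) | j <- [3, 2, 1, 0 :: Int] ]
    bigInner   = Map.fromList [ (key t, 1) | t <- [1 .. half] ]
    singletons = [ (key p, Map.singleton (key 1) 1) | p <- [1 .. half - 1] ]
    value      = Map.fromList ((key half, bigInner) : singletons)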

@kwxm kwxm (Contributor) commented Nov 10, 2025

I wouldn't bother with varying a and b

I think it's worth doing that at least once just to make sure that the time taken depends only on the sum of a and b . If we can show that it does then we can just restrict the benchmarking to the case when a = 1, so you only have one entry in the outer map; alternatively, you can take b = 1 and just worry about the size of the outer map. I think we can effectively regard the entire map as one big tree: when you get to the tip of the outer map you move into one of the inner maps and continue searching there, so it's like the inner maps are glued onto the leaves of the outer map. Then all that matters is the total depth, a+b. I'd like to check that assumption before going any further though: it's better to have some evidence than just to guess what's going on.

A Member commented:

If we can show that it does then we can just restrict the benchmarking to the case when a = 1, so you only have one entry in the outer map; alternatively, you can take b = 1 and just worry about the size of the outer map.

You can't. lookupCoin is O(log max(m, k)). So it's better to balance the size of the outer map and the size of the largest inner map.

A Member commented:

In other words, given a size measure of 10, you can:

  • Make the outer map have 1024 entries, and the largest inner map have 1024 entries
  • Or make the outer map have 1024 entries, and the largest inner map have 1 entry
  • Or make the outer map have 1 entry, and the largest inner map have 1024 entries

Obviously the first is the worst case

@kwxm kwxm (Contributor) commented Nov 10, 2025

You can't. lookupCoin is O(log max(m, k))

No, I don't think it is. I think it's O(log a + log b), or at least that log a + log b is a size measure that will give a more precise result than log(max(a, b)), and that's what my proposed experiment is trying to confirm. The O's are obscuring what's actually going on since they're hiding the details of the constants. For the sum the actual costing function will be of the form r + s*(log a + log b) and for the maximum it'll be of the form u + v*log(max(a, b)). Now log a + log b <= 2*max(log a, log b), with equality when a = b, so r + s*(log a + log b) <= r + 2*s*max(log a, log b), and when you put an O round them they become the same, but the right hand one can actually be almost twice the left hand one.

In other words, given a size measure of 10, you can:

Make the outer map have 1024 entries, and the largest inner map have 1024 entries
Or make the outer map have 1024 entries, and the largest inner map have 1 entry
Or make the outer map have 1 entry, and the largest inner map have 1024 entries

Obviously the first is the worst case

With the sum size measure the first map has a size of 20 and the other ones have a size of 11, but with the maximum they all have size 10, so the sum version picks out the worst case and the maximum doesn't. A straight line fitted to the benchmarking results using the sum measure should give us a more precise bound for the execution time than if we use the maximum measure. The sum is a more accurate measure of the total depth of a value than the maximum, and I think that the total depth is what we need to worry about.

@Unisay Unisay (Contributor, Author) commented:

I ran the benchmarks and here is what they show (LLM summary):

Two-Level Map Lookup Performance: Experimental Analysis

  Experimental Setup

  Data structure: Map ByteString (Map ByteString Natural)

  Key characteristics:
  - 32-byte ByteStrings with 28-byte common prefix (simulating hash-like keys)
  - Perfect binary tree structure (keys = 2^n - 1)
  - Worst-case lookup: deepest key in both outer and inner maps

  Test parameters:
  - Outer map depth: a (containing 2^a - 1 keys)
  - Inner map depth: b (containing 2^b - 1 keys)
  - Total depth: N = a + b

  ---
  Experiment 1: Distribution Impact (Constant Total Depth)

  Hypothesis: When N is held constant, how does the distribution of depth between outer and inner maps
  affect performance?

  Test configuration: N = 17, varying distributions from (1,16) to (16,1)

  Results: Distribution Has Minimal Impact

  | Distribution   | Outer Keys | Inner Keys | Lookup Time | Deviation from Mean |
  |----------------|------------|------------|-------------|---------------------|
  | (10,7)         | 1,023      | 127        | 1.657 μs    | +1.5%               |
  | (6,11)         | 63         | 2,047      | 1.655 μs    | +1.4%               |
  | (8,9) balanced | 255        | 511        | 1.644 μs    | +0.7%               |
  | (1,16) extreme | 1          | 65,535     | 1.607 μs    | -1.6%               |
  | (16,1) extreme | 65,535     | 1          | 1.598 μs    | -2.1%               |

  Range: 1.598 μs to 1.657 μs (3.6% variation)

  Key finding: Distribution choice has negligible impact on performance when total depth is constant.

  ---
  Experiment 2: Linear Scaling with Total Depth

  Hypothesis: Lookup time scales linearly with total depth N = a + b, regardless of distribution.

  Test configuration: N ∈ {10, 12, 14, 16, 18, 20}, three representative distributions per N

  Results: Strong Linear Relationship

  | N (Total Depth) | Representative Samples  | Min Time | Max Time | Average Time | Cost per Level |
  |-----------------|-------------------------|----------|----------|--------------|----------------|
  | 10              | (1,9), (5,5), (9,1)     | 125.6 ns | 128.3 ns | 127.0 ns     | —              |
  | 12              | (1,11), (6,6), (11,1)   | 132.9 ns | 142.9 ns | 136.9 ns     | +4.95 ns       |
  | 14              | (1,13), (7,7), (13,1)   | 140.1 ns | 154.0 ns | 144.5 ns     | +3.80 ns       |
  | 16              | (1,15), (8,8), (15,1)   | 145.3 ns | 160.4 ns | 151.1 ns     | +3.30 ns       |
  | 18              | (1,17), (9,9), (17,1)   | 152.7 ns | 167.6 ns | 158.6 ns     | +3.75 ns       |
  | 20              | (1,19), (10,10), (19,1) | 158.8 ns | 173.9 ns | 168.8 ns     | +5.10 ns       |

  Average cost per level: 4.18 ns

  Linear model: Time ≈ 80 ns + 4.2 ns × N (R² ≈ 0.99)

  ---
  Scaling Analysis

  | Test                   | Expected (if linear) | Observed    | Result     |
  |------------------------|----------------------|-------------|------------|
  | N=10 → N=20 (2x depth) | 2.00x time           | 1.33x time  | Sub-linear |
  | Per-level increment    | Constant             | 4.18 ns avg | ✅ Constant |

  Note: The sub-linear scaling (1.33x instead of 2x) suggests the baseline overhead is significant
  relative to per-level cost at small N values.

  ---
  Key Experimental Tendencies

  1. Distribution Independence (Constant N)
    - Variation between distributions: <4%
    - Expensive key comparisons dominate performance
    - Tree shape/cache effects are negligible
  2. Linear Depth Scaling
    - Each additional tree level: +4.2 ns
    - Baseline overhead: ~80 ns
    - Strong linear correlation (R² ≈ 0.99)
  3. Distribution Variation by Depth

  | Total Depth (N) | Max-Min Spread | % Variation |
  |-----------------|----------------|-------------|
  | 10              | 2.7 ns         | 2.1%        |
  | 12              | 10.0 ns        | 7.3%        |
  | 14              | 13.9 ns        | 9.6%        |
  | 16              | 15.1 ns        | 10.0%       |
  | 18              | 14.9 ns        | 9.4%        |
  | 20              | 15.1 ns        | 8.9%        |

  Pattern: Variation increases slightly at deeper depths but remains <10%

  ---
  Conclusions

  1. For expensive key comparisons (ByteStrings with common prefixes):
    - Lookup time is primarily determined by total depth (a + b)
    - Distribution choice (split between outer/inner) has minimal impact
  2. Performance model:
  Lookup Time ≈ 80 ns + 4.2 ns × (a + b)
  3. Practical implication:
    - When designing nested map structures with expensive keys, optimize for total depth minimization
  rather than specific distribution patterns
    - Tree balancing and cache optimization are secondary concerns
  4. Cost breakdown:
    - ~80 ns: Fixed overhead (function calls, setup)
    - ~4.2 ns per level: Key comparison + tree traversal

A Member commented:

I'm fine with using log m + log k instead of log (max m k). I don't think it really matters either way, but let's just make a decision and move forward with getting the costing done.

We need to stay on schedule for HF by EOY, and it is a firm deadline. If both @kwxm and @Unisay prefer log m + log k then go ahead with it.

@Unisay Unisay force-pushed the yura/costing-builtin-value branch from 5b8d02e to bfd577b on November 11, 2025 11:16
@Unisay Unisay requested a review from ana-pantilie November 11, 2025 14:13
Introduce new ExMemoryUsage wrapper that measures Value cost as the sum of logarithmic sizes (log of outer map size + log of max inner map size).

This reflects the O(log m + log k) cost behavior of two-level map lookups, based on experimental evidence showing lookup time scales linearly with the sum of depths rather than their maximum.

Modernizes integerLog2 usage by switching from GHC.Integer.Logarithms to GHC.Num.Integer.
Add ValueTotalSize and ValueLogOuterSizeAddLogMaxInnerSize to the DefaultUni builtin type system, enabling these wrappers to be used in builtin function signatures.

Both wrappers are coercions of the underlying Value type with specialized memory measurement behavior.
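
A minimal sketch of the size measure itself, assuming GHC.Num.Integer exports integerLog2 :: Integer -> Word (as the commit above implies); this is not the PR's actual ExMemoryUsage instance:

import           Data.ByteString (ByteString)
import qualified Data.Map.Strict as Map
import           GHC.Num.Integer (integerLog2)

-- log2 of the outer map size plus log2 of the largest inner map size,
-- guarding against taking the logarithm of zero.
valueLogSize :: Map.Map ByteString (Map.Map ByteString Integer) -> Integer
valueLogSize outer = log2 (Map.size outer) + log2 maxInnerSize
  where
    log2 n       = toInteger (integerLog2 (toInteger (max 1 n)))
    maxInnerSize = if Map.null outer then 0
                   else maximum (map Map.size (Map.elems outer))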
Add cost model parameters for four new Value-related builtins: LookupCoin (3 arguments), ValueContains (2 arguments), ValueData (1 argument), and UnValueData (1 argument).

Updates BuiltinCostModelBase type, memory models, cost model names, and unit cost models. Prepares infrastructure for actual cost models to be fitted from benchmarks.
Apply memory wrappers and cost model parameters to Value builtin denotations. LookupCoin wraps Value with ValueLogOuterSizeAddLogMaxInnerSize, ValueContains uses the wrapper for container and ValueTotalSize for contained value.

Replaces unimplementedCostingFun with actual cost model parameters. Updates golden type signatures to reflect wrapper types.
Add systematic benchmarking framework with worst-case test coverage: LookupCoin with 400 power-of-2 combinations testing BST depth range 2-21, ValueContains with 1000+ cases using multiplied_sizes model for x * y complexity.

Includes R statistical models: linearInZ for LookupCoin, multiplied_sizes for ValueContains to properly account for both container and contained sizes.
Update all three cost model variants (A, B, C) with parameters fitted from comprehensive benchmark runs. Includes extensive timing data covering full parameter ranges for all four Value builtins.

Models derived from remote benchmark runs on dedicated hardware with systematic worst-case test coverage ensuring conservative on-chain cost estimates.
Update test expectations across the codebase to reflect refined cost models: conformance test budgets (8 cases), ParamName additions for V1/V2/V3 ledger APIs (11 new params per version), param count tests, cost model registrations, and generator support.

All updates reflect the transition from placeholder costs to fitted models.
Document the addition of fitted cost model parameters for Value-related builtins based on comprehensive benchmark measurements.
@Unisay Unisay force-pushed the yura/costing-builtin-value branch from 99d05eb to 37f29be on November 13, 2025 18:29
Fix bug where worst-case entry could be duplicated in selectedEntries when it appears at a low position in allEntries (which happens for containers with small tokensPerPolicy values).

The issue occurred because the code took the first N-1 entries from allEntries and then appended worstCaseEntry, without checking if worstCaseEntry was already included in those first N-1 entries. For containers like 32768×2, the worst-case entry (policy[0], token[1]) is at position 1, so it was included in both the "others" list and explicitly appended, creating a duplicate.

Value.fromList deduplicates entries, resulting in benchmarks with one fewer entry than intended (e.g., 99 instead of 100), producing incorrect worst-case measurements.

Solution: Filter out worstCaseEntry from allEntries before taking the first N-1 entries, ensuring it only appears once at the end of the selected entries list.
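
A minimal sketch of the described fix, with hypothetical names:

-- Drop the worst-case entry from the candidate pool before taking the first
-- n-1 entries, then append it exactly once at the end.
selectEntries :: Eq a => Int -> a -> [a] -> [a]
selectEntries n worstCaseEntry allEntries =
  take (n - 1) (filter (/= worstCaseEntry) allEntries) ++ [worstCaseEntry]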
Replace manual iteration + lookupCoin implementation with Data.Map.Strict's
isSubmapOfBy, which provides 2-4x performance improvement through:

- Parallel tree traversal instead of n₂ independent binary searches
- Better cache locality from sequential traversal
- Early termination on first mismatch
- Reduced function call overhead

Implementation change:
- Old: foldrWithKey + lookupCoin for each entry (O(n₂ × log(max(m₁, k₁))))
- New: isSubmapOfBy (isSubmapOfBy (<=)) (O(m₂ × k_avg) with better constants)

Semantic equivalence verified:
- Both check v2 ⊆ v1 using q2 ≤ q1 for all entries
- All plutus-core-test property tests pass (99 tests × 3 variants)
- Conformance tests show expected budget reduction (~50% CPU cost reduction)

Next steps:
- Re-benchmark with /costing:remote to measure actual speedup
- Re-fit cost model parameters (expect slope reduction from 6548 to ~1637-2183)
- Update conformance test budget expectations after cost model update

Credit: Based on optimization discovered by Kenneth.
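
A minimal sketch of the nested-submap check described in this commit (not the exact PR code):

import           Data.ByteString (ByteString)
import qualified Data.Map.Strict as Map

-- v2 is contained in v1 iff every (policy, token, q2) entry of v2 has a
-- matching (policy, token, q1) entry in v1 with q2 <= q1.
valueContains
  :: Map.Map ByteString (Map.Map ByteString Integer)  -- v1, the container
  -> Map.Map ByteString (Map.Map ByteString Integer)  -- v2, the contained
  -> Bool
valueContains v1 v2 = Map.isSubmapOfBy (Map.isSubmapOfBy (<=)) v2 v1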
Optimize generateConstrainedValueWithMaxPolicy to minimize off-path
map sizes while maintaining worst-case lookup guarantees:

1. Sort keys explicitly to establish predictable BST structure
2. Select maximum keys (last in sorted order) for worst-case depth
3. Populate only target policy with full token set (tokensPerPolicy)
4. Use minimal maps (1 token) for all other policies

Impact:
- 99.7% reduction in benchmark value size (524K → 1.5K entries)
- ~340× faster map construction during benchmark generation
- ~99.7% memory reduction (52 MB → 150 KB per value)
- Zero change to cost measurements (worst-case preserved)

Affects: LookupCoin, ValueContains benchmarks

Formula: totalEntries = tokensPerPolicy + (numPolicies - 1)
Example: 1024 policies × 512 tokens = 1,535 entries (was 524,288)

Rationale: BST lookups only traverse one path from root to leaf.
Off-path policies are never visited, so their inner map sizes don't
affect measurement. Reducing off-path maps from tokensPerPolicy to 1
eliminates 99.7% of irrelevant data without changing worst-case cost.

Technical details:
- ByteString keys already use worst-case comparison (28-byte prefix)
- Sorting + last selection guarantees maximum BST depth (rightmost leaf)
- Target policy still has full token set for worst-case inner lookup
- Validates correct behavior: build succeeds, benchmarks run normally
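
A minimal sketch of the resulting shape (hypothetical names, not generateConstrainedValueWithMaxPolicy itself): only the target policy carries the full token set, every other policy gets a single token, so totalEntries = tokensPerPolicy + (numPolicies - 1).

import           Data.ByteString (ByteString)
import qualified Data.Map.Strict as Map

-- Assumes both key lists are non-empty and sorted; the last policy key is
-- the lookup target and the only one populated with the full token set.
constrainedValue
  :: [ByteString] -> [ByteString]
  -> Map.Map ByteString (Map.Map ByteString Integer)
constrainedValue policies tokens =
  let targetPolicy = last policies
      fullInner    = Map.fromList [ (t, 1) | t <- tokens ]
      smallInner   = Map.singleton (head tokens) 1
  in  Map.fromList $ (targetPolicy, fullInner)
        : [ (p, smallInner) | p <- init policies ]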
…ization

Update benchmark data and cost model parameters based on optimized
valueContains implementation using Map.isSubmapOfBy.

Benchmark results show significant performance improvement:
- Slope: 6548 → 1470 (4.5x speedup in per-operation cost)
- Intercept: 1000 → 1,163,050 (increased fixed overhead)

The slope reduction confirms the 3-4x speedup observed in local testing.
Higher intercept may reflect actual setup overhead in isSubmapOfBy or
statistical fitting on the new benchmark distribution.

Benchmark data: 1023 ValueContains measurements from GitHub Actions run
19367901303 testing the optimized implementation.
@Unisay Unisay enabled auto-merge (squash) November 14, 2025 19:05
@Unisay Unisay requested review from kwxm and zliu41 November 14, 2025 19:05