perf(builtins): optimize listToArray implementation and costing #7468

Merged
zeme-wana merged 11 commits into master from yura/list-to-array-improvements
Dec 3, 2025

Conversation

@Unisay Unisay commented Nov 28, 2025

Context

  • What: Optimize listToArray builtin implementation and update cost model parameters
  • Why: Improve performance and accuracy of list-to-array conversions in Plutus Core

Approach

This PR makes two key optimizations to the listToArray builtin:

1. Implementation Optimization: Changed from Vector.fromList to Vector.fromListN, which avoids repeated array resizing during conversion. Since we already traverse the list to calculate memory usage, we know the length upfront and can allocate exactly the size needed.

2. Memory Usage Costing: Changed list memory costing from eager full-spine traversal to batched incremental costing. Instead of computing length l upfront (which forces the entire list spine), we now process the list in batches of 100 elements, producing CostRose nodes incrementally. This is more efficient for large lists while maintaining accurate costing.
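The fromListN change can be sketched as follows. This is a simplified illustration, not the actual builtin code; listToArraySketch is a hypothetical name.

```haskell
import qualified Data.Vector as Vector

-- Simplified illustration of the fromListN optimization: when the
-- list length is already known (here recomputed with length, in the
-- builtin it falls out of the costing traversal), fromListN allocates
-- the target array at its final size in one step, whereas fromList
-- must grow the buffer repeatedly as elements arrive.
listToArraySketch :: [a] -> Vector.Vector a
listToArraySketch xs = Vector.fromListN (length xs) xs
```

Vector.fromListN takes the expected length as its first argument and truncates any excess, so passing the exact length preserves the list contents while avoiding resize copies.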

The benchmark range was also expanded from 1-100 to 1-5000 elements to capture the linear cost behavior across a wider range of inputs, resulting in updated cost model parameters (lower intercept, higher slope).

Changes

Core Implementation

  • Use Vector.fromListN instead of Vector.fromList for more efficient array allocation
  • Implement batched memory costing for lists using CostRose tree structure
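The batched costing idea can be sketched as below. CostRose here is a simplified local stand-in for the type of the same name in Plutus Core's ExMemoryUsage machinery; the per-element cost and function names are illustrative.

```haskell
-- Simplified stand-in for Plutus Core's CostRose: a cost charged at
-- this node plus sub-roses produced on demand. The real type differs
-- in detail.
data CostRose = CostRose !Integer [CostRose]

-- Charge for the list in batches of 100 elements. Each node accounts
-- for at most one batch; the remainder of the spine is only forced
-- when the evaluator demands the next sub-rose, so a large list is
-- never traversed eagerly in one go.
batchedListCost :: Integer -> [a] -> CostRose
batchedListCost perElem xs =
  case splitAt 100 xs of
    (batch, [])   -> CostRose (perElem * fromIntegral (length batch)) []
    (_,     rest) -> CostRose (perElem * 100) [batchedListCost perElem rest]
```

Later commits in this PR replace splitAt with a statically unrolled dropN to avoid the intermediate tuple and prefix-list allocation; splitAt is used here only for clarity.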

Cost Model Updates

  • Updated listToArray CPU cost parameters based on new benchmarks
  • Intercept reduced from 307802 to 1000
  • Slope increased from 8496 to 24619
  • Reflects the optimized implementation characteristics
  • Regenerated indexArray and lengthOfArray CPU costs (were stale since 69cb3d9)

Cost Model Visualization Fix

  • Fixed overhead calculation in JS to use Nop*o pattern (matching R's models.R)

Cost Model Presented Visually

https://plutus.cardano.intersectmbo.org/pr-preview/cost-models/pr-7468/listtoarray/index.html?branch=yura%2Flist-to-array-improvements

Benchmarking

  • Expanded benchmark range to 1-5000 elements for better linear model fitting

Documentation

  • Added cost model visualization pages for all three array builtins:
    • ListToArray (linear cost model)
    • LengthOfArray (constant cost)
    • IndexArray (constant cost)
  • Updated navigation across all existing visualization pages

How to Test

Implementation Testing

  1. Run the test suite: cabal test plutus-core-test
  2. Run conformance tests (expect budget differences due to cost model changes)

Cost Model Verification

  1. Start local server: cd doc/cost-models && python -m http.server 8000
  2. Visit http://localhost:8000/listtoarray/
  3. Verify benchmark data points align with model predictions
  4. Check LengthOfArray and IndexArray pages show constant cost behavior

Benchmark Reproduction

cabal run plutus-core:cost-model-budgeting-bench -- ListToArray

Author's Checklist

  • Implementation uses more efficient Vector.fromListN
  • Memory costing uses batched incremental approach
  • Cost model parameters updated from new benchmarks
  • Visualization pages added for array builtins
  • Conformance test budget updates reviewed
  • Fixed stale indexArray/lengthOfArray cost models
  • Fixed JS visualization overhead calculation

Improve listToArray builtin performance and costing precision:

- Use Vector.fromListN instead of Vector.fromList to allocate the exact
  array size upfront, avoiding incremental resizing overhead
- Implement batched ExMemoryUsage for lists that processes the spine in
  chunks of 100 elements for more efficient cost computation
- Expand benchmark coverage to 1-5000 element lists for better cost
  model fitting across a wider range of input sizes

Update listToArray cost model based on new benchmark results. The new
implementation has a lower base cost (intercept: 1000 vs 307802) but a
higher per-element cost (slope: 24619 vs 8496), reflecting the
optimized algorithm that processes elements more efficiently with
lower overhead.

Add interactive cost model visualization pages for the three array
builtin functions: ListToArray, LengthOfArray, and IndexArray. Each
page displays benchmark data alongside fitted cost model predictions,
following the established pattern for existing function visualizations.

Update expected CPU budgets for listToArray conformance tests to
reflect the updated cost model parameters (lower intercept, higher
slope).
@Unisay Unisay self-assigned this Nov 28, 2025
@Unisay Unisay added the No Changelog Required Add this to skip the Changelog Check label Nov 28, 2025

github-actions bot commented Nov 28, 2025

PR Preview Action v1.6.2

🚀 View preview at
https://IntersectMBO.github.io/plutus/pr-preview/cost-models/pr-7468/

Built to branch gh-pages at 2025-12-02 11:08 UTC.
Preview will be ready when the GitHub Pages deployment is complete.

@effectfully effectfully left a comment

The data here still only shows lists of lengths up to 100? And the formula is the old one?


zliu41 commented Nov 29, 2025

Intercept reduced from 307802 to 1000
Slope increased from 8496 to 24619

How could the slope increase so dramatically? These numbers imply that for length > 20, the new cost will be much more expensive than the old one. Is that what you expected?


Unisay commented Dec 1, 2025

The data here still only shows lists of lengths up to 100? And the formula is the old one?

It's important to select the proper branch in the data source.

The link in the PR description does specify the branch to use, so if you follow that link it should be set automatically.


Unisay commented Dec 1, 2025

Intercept reduced from 307802 to 1000
Slope increased from 8496 to 24619

How could the slope increase so dramatically? These numbers imply that for length > 20, the new cost will be much more expensive than the old one. Is that what you expected?

The old benchmark only tested lists up to 100 elements, so the high intercept (307802) was compensating for an underestimated slope. With 1-5000 elements, the regression captures the actual per-element cost. Break-even is around n≈19, so yes, longer lists cost more than before, but that's because the old model was undercharging by extrapolating from too little data.
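The break-even point follows directly from the two linear models. A quick sanity check using the intercept and slope figures quoted above (function names are illustrative):

```haskell
-- Old and new linear CPU cost models from this thread
-- (cost n = intercept + slope * n).
oldCost, newCost :: Integer -> Integer
oldCost n = 307802 + 8496 * n
newCost n = 1000 + 24619 * n

-- Smallest list length at which the new model charges more:
-- 1000 + 24619n > 307802 + 8496n  =>  n > 306802 / 16123 ~ 19.03
breakEven :: Integer
breakEven = head [n | n <- [1 ..], newCost n > oldCost n]
```

So lists of up to 19 elements are cheaper under the new model, and longer lists are dearer, matching the discussion above.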

Extract Peano numbers, NatToPeano type family, and static unrolling utilities into a dedicated module. This consolidates type-level machinery for compile-time loop unrolling previously duplicated in StepCounter.

Provides a Drop class for statically unrolled list operations and an UpwardsM class for statically unrolled effectful iteration.

Remove the duplicated Peano, NatToPeano, and UpwardsM definitions from StepCounter and import them from PlutusCore.Unroll instead. This reduces code duplication while maintaining identical runtime behavior.

Replace splitAt with dropN in the ExMemoryUsage instance for lists. The dropN function uses compile-time instance resolution to unroll the drop operation, avoiding the tuple and prefix-list allocation that splitAt requires.
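The statically unrolled drop described in these commits can be sketched as follows. This is a simplified version of the technique; the actual PlutusCore.Unroll module also provides a NatToPeano type family and an UpwardsM class, and its exact names and signatures may differ.

```haskell
{-# LANGUAGE AllowAmbiguousTypes #-}
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE TypeApplications #-}

-- Type-level Peano naturals used to drive instance resolution.
data Peano = Z | S Peano

-- Drop a statically known number of elements from a list.
class Drop (n :: Peano) where
  dropN :: [a] -> [a]

instance Drop 'Z where
  dropN = id

-- GHC resolves these instances at compile time, so e.g.
-- dropN @('S ('S 'Z)) unrolls into two pattern matches with no
-- runtime counter and no intermediate allocation (splitAt, by
-- contrast, also builds the prefix list and a tuple).
instance Drop n => Drop ('S n) where
  dropN []       = []
  dropN (_ : xs) = dropN @n xs
```

For a fixed batch size like 100, the unrolled dropN lets the batched costing walk the spine without allocating the prefix that splitAt would produce.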
…timation

- Changed the slope value from 24619 to 24838 in builtinCostModelA.json
- Changed the slope value from 24619 to 24838 in builtinCostModelB.json
- Changed the slope value from 24619 to 24838 in builtinCostModelC.json
@zliu41 zliu41 left a comment

LGTM.

Adjust budget expectation for listToArray-02 test case to reflect the
updated CPU slope in the cost model (286671 → 288642 cpu).
@Unisay Unisay force-pushed the yura/list-to-array-improvements branch from 8ce3c78 to baf0499 Compare December 2, 2025 10:18
Match R's models.R which uses Opaque (Nop*o) benchmarks for overhead
calculation. The JS visualization was incorrectly using Nop*b pattern.
These values were incorrect since commit 69cb3d9. Regenerated using
generate-cost-model tool from current benching-conway.csv data.

kwxm commented Dec 3, 2025

The data here still only shows lists of lengths up to 100? And the formula is the old one?

It's important to select the proper branch in the data source.

The link in the PR description does specify the branch to use, so if you follow that link it should be set automatically.

I was confused by this as well, and the link in the PR description is only displaying sizes up to 100. It looks as if it's still displaying the old data because the new data doesn't have many points with x <= 100. Maybe I've missed out some step?

@zeme-wana zeme-wana merged commit 4fcf1a3 into master Dec 3, 2025
4 of 5 checks passed
@zeme-wana zeme-wana deleted the yura/list-to-array-improvements branch December 3, 2025 10:00
@kwxm kwxm added Builtins Costing Anything relating to costs, fees, gas, etc. labels Jan 8, 2026