Commit 078a6f9

update to new algorithm
1 parent c694155 commit 078a6f9

2 files changed: 163 additions and 152 deletions.

Lines changed: 163 additions & 0 deletions
@@ -0,0 +1,163 @@
---
simd: '0186'
title: Loaded Transaction Data Size Specification
authors:
  - Hanako Mumei
category: Standard
type: Core
status: Review
created: 2024-10-20
feature: (fill in with feature tracking issues once accepted)
---

## Summary

Before a transaction can be executed, every account it may read from or write to
must be loaded, including any programs it may call. The amount of data a
transaction is allowed to load is capped, and if a transaction exceeds that
limit, loading is aborted. This functionality is already implemented in the
validator. The purpose of this SIMD is to explicitly define how loaded
transaction data size is calculated.

## Motivation

Transaction data size accounting is currently unspecified, and the
implementation-defined algorithm used in the Agave client exhibits some
surprising behaviors:

* BPF loaders required by instructions' program IDs are counted against
transaction data size. BPF loaders required by CPI programs are not. If a
required BPF loader is also included in the accounts list, it is counted twice.
* The size of a program owned by LoaderV3 may or may not include the size of its
programdata, depending on how the program account is used in the transaction.
Programdata is also itself counted if included in the transaction accounts list.
This means programdata may be counted zero, one, or two times per transaction.
* Due to certain quirks of implementation, loader-owned accounts which do not
contain valid programs for execution may or may not be counted against the
transaction data size total, depending on how they are used in the transaction.
This includes, but is not limited to, LoaderV3 buffer accounts and accounts
which fail ELF validation.
* Accounts can be included in a transaction accounts list without being an
instruction account, fee-payer, or program ID. These accounts are presently
loaded and counted against transaction data size, although they can never be
used for any purpose by the transaction.

All validator clients must arrive at precisely the same transaction data size
for all transactions, because a difference of one byte can determine whether a
transaction executes or fails, and thus affects consensus. We also want the
calculated transaction data size to correspond well with the actual amount of
data the transaction requests.

Therefore, this SIMD seeks to specify an algorithm that is straightforward to
implement in a client-agnostic way, while also accurately accounting for all
account data required by the transaction.

## New Terminology

This SIMD introduces no new terms, but we define the following for clarity:

* Instruction account: an account passed to an instruction in its accounts
array, which allows the program to view the actual bytes contained in the
account. CPI can only happen through programs provided as instruction accounts.
* Transaction accounts list: all accounts for the transaction, which includes
instruction accounts, the fee-payer, program IDs, and any extra accounts added
to the list but not used for any purpose.
* LoaderV3 program account: an account owned by
`BPFLoaderUpgradeab1e11111111111111111111111` whose account data begins with
the four bytes `02 00 00 00` followed by a pubkey. The account at that pubkey
is defined as the program's programdata account.

For the purposes of this SIMD, we make no assumptions about the contents of the
programdata account.

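The LoaderV3 program account definition above can be illustrated with a minimal Python sketch that extracts the programdata pubkey from raw account data. The function name `programdata_address`, the use of a base58 string for the owner, and the assumption that the pubkey is stored as 32 raw bytes immediately after the four-byte tag are illustrative choices, not part of this SIMD.

```python
from typing import Optional

LOADER_V3_ID = "BPFLoaderUpgradeab1e11111111111111111111111"
PROGRAM_TAG = bytes([0x02, 0x00, 0x00, 0x00])  # first four bytes of a program account
PUBKEY_LEN = 32  # assumption: the pubkey is stored as 32 raw bytes


def programdata_address(owner: str, data: bytes) -> Optional[bytes]:
    """Return the referenced programdata pubkey, or None if the account
    does not match the LoaderV3 program account definition."""
    if owner != LOADER_V3_ID:
        return None
    if len(data) < len(PROGRAM_TAG) + PUBKEY_LEN:
        return None
    if data[: len(PROGRAM_TAG)] != PROGRAM_TAG:
        return None
    start = len(PROGRAM_TAG)
    return data[start : start + PUBKEY_LEN]


# Hypothetical account data: the tag followed by a made-up 32-byte pubkey.
example_key = bytes(range(32))
print(programdata_address(LOADER_V3_ID, PROGRAM_TAG + example_key) == example_key)  # True
```

Note that the sketch only inspects the program account itself. Consistent with the paragraph above, it never looks inside the programdata account.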
## Detailed Design

The proposed algorithm is as follows:

1. Given a transaction, take the unique set of account keys which are used as:
   * An instruction account.
   * A program ID for an instruction.
   * The fee-payer.
2. Each account's size is determined solely by the byte length of its data prior
to transaction execution.
3. For any `LoaderV3` program account, add the size of the programdata account
it references, if that account exists.
4. The total loaded transaction data size is the sum of these sizes.

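The four steps can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a reference implementation: the `Account` type, the `accounts_db` dictionary, the `key` helper, the placeholder owner strings, and the choice to let a missing account contribute zero bytes are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Iterable

LOADER_V3_ID = "BPFLoaderUpgradeab1e11111111111111111111111"
PROGRAM_TAG = bytes([2, 0, 0, 0])


@dataclass
class Account:
    owner: str   # owner program ID (hypothetical string form)
    data: bytes  # account data prior to transaction execution


def loaded_data_size(
    fee_payer: bytes,
    program_ids: Iterable[bytes],
    instruction_accounts: Iterable[bytes],
    accounts_db: Dict[bytes, Account],
) -> int:
    # Step 1: unique set of keys used as fee-payer, program ID, or instruction account.
    keys = {fee_payer, *program_ids, *instruction_accounts}
    total = 0
    for k in keys:
        account = accounts_db.get(k)
        if account is None:
            continue  # assumption: a missing account contributes zero bytes
        total += len(account.data)  # Step 2: byte length of the account data
        # Step 3: for a LoaderV3 program account, add the referenced programdata size.
        if (
            account.owner == LOADER_V3_ID
            and len(account.data) >= 36
            and account.data[:4] == PROGRAM_TAG
        ):
            programdata = accounts_db.get(account.data[4:36])
            if programdata is not None:
                total += len(programdata.data)
    return total  # Step 4: the sum of all sizes


def key(n: int) -> bytes:
    """Hypothetical helper producing a distinct 32-byte key."""
    return bytes([n]) * 32


db = {
    key(1): Account("system", b""),                       # fee-payer
    key(2): Account(LOADER_V3_ID, PROGRAM_TAG + key(3)),  # program account, 36 bytes
    key(3): Account(LOADER_V3_ID, bytes(1000)),           # programdata
    key(4): Account("some-program", bytes(200)),          # instruction account
    key(5): Account("some-program", bytes(5000)),         # on the list but unused
}

print(loaded_data_size(key(1), [key(2)], [key(4)], db))  # 1236
```

Here the result is 0 + 36 + 1000 + 200 bytes. The account at `key(5)` is never passed to the function because it is not a fee-payer, program ID, or instruction account, so it contributes nothing.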
Transactions may include a
`ComputeBudgetInstruction::SetLoadedAccountsDataSizeLimit` instruction to define
a data size limit for the transaction. Otherwise, the default limit is 64 MiB
(`64 * 1024 * 1024` bytes).

If a transaction exceeds its data size limit, the transaction fails. Fees are
charged for such transactions once the
`enable_transaction_loading_failure_fees` feature is enabled.

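The limit check can be sketched as below. The function name `exceeds_limit` is hypothetical, and reading "exceeds" as strictly greater than the limit is an assumption of this sketch. It also does not model any validation a client may apply to the requested limit.

```python
from typing import Optional

DEFAULT_LIMIT = 64 * 1024 * 1024  # 64 MiB, the default stated above


def exceeds_limit(loaded_size: int, requested_limit: Optional[int] = None) -> bool:
    """Return True if the loaded data size is over the effective limit.

    requested_limit stands in for a value set through
    SetLoadedAccountsDataSizeLimit; None means no such instruction was included.
    """
    limit = DEFAULT_LIMIT if requested_limit is None else requested_limit
    return loaded_size > limit  # assumption: exactly at the limit is still allowed


print(exceeds_limit(DEFAULT_LIMIT))               # False
print(exceeds_limit(DEFAULT_LIMIT + 1))           # True
print(exceeds_limit(2048, requested_limit=1024))  # True
```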
Required loaders are no longer added to transaction data size automatically.
They are treated the same as any other account: counted if used in a manner
described in step 1, not counted otherwise.

No account that falls outside of the three categories listed in step 1 is
counted against transaction data size. Validator clients are free to decline to
load such accounts.

Read-only and writable accounts are treated the same. In the future, when direct
mapping is enabled, this SIMD may be amended to count them differently.

As a consequence of steps 1 and 3, for LoaderV3 programs, programdata is counted
twice if a transaction explicitly references both the program account and its
programdata account. This is done partly for simplicity, and partly to account
for the cost of maintaining the compiled program in addition to the actual bytes
of the programdata account.

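A short worked example of the double counting, assuming a program account of exactly 36 bytes (the four-byte tag plus a 32-byte pubkey) and a made-up programdata size of 100,000 bytes. The function name `counted_size` is hypothetical.

```python
# Assumption: a LoaderV3 program account is exactly the 4-byte tag plus a
# 32-byte pubkey. The programdata size used below is a made-up figure.
PROGRAM_ACCOUNT_SIZE = 4 + 32


def counted_size(programdata_size: int, programdata_listed: bool) -> int:
    """Bytes counted for one LoaderV3 program under steps 1 to 4."""
    # Steps 2 and 3: the program account itself plus the programdata it references.
    total = PROGRAM_ACCOUNT_SIZE + programdata_size
    if programdata_listed:
        # Step 2 again: the programdata account is counted in its own right
        # when the transaction also uses it in one of the step 1 roles.
        total += programdata_size
    return total


print(counted_size(100_000, programdata_listed=False))  # 100036
print(counted_size(100_000, programdata_listed=True))   # 200036
```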
We include programdata size in account size for LoaderV3 programs because using
the program account in a transaction forces an unconditional load of programdata
to compile the program for execution. We always count it, even when the program
is an instruction account, because the program must be available for CPI.

There is no special handling for any account owned by the native loader,
LoaderV1, or LoaderV2.

Account size for programs owned by LoaderV4 is left undefined. This SIMD should
be amended to define the required semantics before LoaderV4 is enabled on any
network.

## Alternatives Considered

* Transaction data size accounting is already enabled, so the null option is to
enshrine the current Agave behavior in the protocol. This is undesirable because
the current behavior is highly idiosyncratic, and LoaderV3 program sizes are
routinely undercounted.
* Builtin programs are backed by accounts that only contain the program name as
a string, typically making them 15-40 bytes. We could impose a larger fixed cost
for these. However, they must be made available for all programs anyway, and
most of them are likely to be ported to BPF eventually, so this adds complexity
for no real benefit.
* Several slightly different algorithms were considered for handling LoaderV3
programs in particular, for instance only counting programs that are valid for
execution in the current slot. However, this would implicitly couple transaction
data size with the results of ELF validation, which is highly undesirable.
* We considered loading and counting sizes for accounts on the transaction
accounts list which are not used for any purpose. This is the current behavior,
but there is no reason to load such accounts at all.

## Impact

The primary impact is that this SIMD makes correctly implementing transaction
data size accounting much easier for other validator clients.

It makes the calculated size of transactions which include program accounts for
CPI somewhat larger, but given the generous 64 MiB limit, it is unlikely that
any existing users will be affected. Based on an investigation of a 30-day
window, transactions larger than 30 MiB are virtually never seen.

## Security Considerations

Security impact is minimal because this SIMD merely simplifies an existing
feature. Care must be taken to implement the rules exactly.

This SIMD requires a feature gate.

## Backwards Compatibility

Transactions that currently have a total transaction data size close to the
64 MiB limit, and which call LoaderV3 programs via CPI, may now exceed it and
fail.

proposals/0186-transaction-data-size-specification.md

Lines changed: 0 additions & 152 deletions
This file was deleted.
