Open
22 commits
156 changes: 156 additions & 0 deletions proposals/0177-program-runtime-abiv2.md
@@ -0,0 +1,156 @@
---
simd: '0177'
title: Program Runtime ABI v2
authors:
- Alexander Meißner
category: Standard
type: Core
status: Idea
created: 2025-02-23
feature: TBD
extends: SIMD-0219
---

## Summary

Align the layout of the virtual address space to large pages in order to
simplify the address translation logic and allow for easy direct mapping.

## Motivation

At the moment all validator implementations have to copy (and compare) data in
and out of the virtual memory of the virtual machine. There are four possible
account data copy paths:

- Serialization: Copy from program runtime (host) to virtual machine (guest)
- CPI call edge: Copy from virtual machine (guest) to program runtime (host)
- CPI return edge: Copy from program runtime (host) to virtual machine (guest)
- Deserialization: Copy from virtual machine (guest) to program runtime (host)

To avoid this a feature named "direct mapping" was designed which uses the
address translation logic of the virtual machine to emulate the serialization
and deserialization without actually performing copies.

Implementing direct mapping in the current ABI v0 and v1 is very complex
because of unaligned virtual memory regions and memory accesses overlapping
multiple virtual memory regions. Instead the layout of the virtual address
space should be adjusted so that all virtual memory regions are aligned to
4 GiB.
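
A minimal sketch of why 4 GiB alignment simplifies address translation: the region index and the offset inside the region fall out of a shift and a mask. The function names here are hypothetical and not part of the proposal.

```rust
/// Size of one virtual memory region under the proposed layout: 4 GiB.
const REGION_SIZE: u64 = 1 << 32;

/// Splits a guest address into (region index, offset within the region).
/// With 4 GiB alignment this is a shift and a mask; no region search is needed.
fn split_vm_address(vm_addr: u64) -> (u64, u64) {
    (vm_addr >> 32, vm_addr & (REGION_SIZE - 1))
}

/// Returns true if an access of `len` bytes starting at `vm_addr` stays
/// inside a single 4 GiB region (hypothetical helper for illustration).
fn access_within_one_region(vm_addr: u64, len: u64) -> bool {
    if len == 0 {
        return true;
    }
    match vm_addr.checked_add(len - 1) {
        // Both ends of the access must share the same upper 32 bits.
        Some(last) => (vm_addr >> 32) == (last >> 32),
        // Address arithmetic overflowed, so the access cannot be valid.
        None => false,
    }
}
```

For example, `0x400000010` splits into region 4 and offset `0x10`, while an 8 byte access starting 4 bytes before the end of region 4 would straddle two regions.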

## Alternatives Considered

None.

## New Terminology

None.

## Detailed Design

Programs signal their support through their SBPF version field being v4 or
above while the program runtime signals which ABI is chosen through the
serialized magic field.

### Per Transaction Serialization

At the beginning of a transaction the program runtime must prepare the
following, which is shared by all instructions running programs supporting the
new ABI. This memory region starts at `0x400000000` and is readonly. It must be
updated as instructions throughout the transaction modify the account metadata
or the scratchpad via `sol_set_return_data`.

- Key of the program which wrote to the scratchpad most recently: `[u8; 32]`
- The scratchpad data: `&[u8]` which is composed of:
  - Pointer to scratchpad data: `u64`
  - Length of scratchpad data: `u64`
- The number of transaction accounts: `u64`
- For each transaction account:
  - Key: `[u8; 32]`
  - Owner: `[u8; 32]`
  - Lamports: `u64`
Comment on lines +91 to +94
Contributor

@LucasSte LucasSte May 16, 2025

After a couple of discussions with @Lichtso, we thought the feedback from developer relations would be important here.

Today programs can only see the accounts passed to them in the instruction being executed. This layout change entails that programs (and every program invoked from them via CPI) will now be able to access metadata from all the accounts passed in the transaction, regardless of whether they were passed in the instruction or not. We still intend to keep the account payload hidden, though, if it is not an instruction account.

Would this change have any unintended consequences on the developer side?

(cc. @joncinque and @jacobcreech )

  - Account payload: `&[u8]` which is composed of:
    - Pointer to account payload: `u64`
    - Account payload length: `u64`
Comment on lines +92 to +97
Contributor

Programs also have access to the booleans writable, signer and executable. Are we serializing these ones as well?

Contributor Author

These are per instruction not per transaction. See the "flags bitfield" in "Per Instruction Serialization".

Comment on lines +96 to +97
Contributor

If we are not mapping the payload for all accounts in the transaction, could we move the payload pointer and length to the instruction area? It would be a little inconvenient to offer developers a slice that causes an access violation when they try to access it, in the case of an account without its payload mapped to the virtual machine.

Contributor Author

That would then require keeping the payload length field in sync in CPI call and return. That is what I would like to avoid.


A readonly memory region starting at `0x500000000` must be mapped in for the
scratchpad data. It must be updated when `sol_set_return_data` is called.
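
The per transaction layout listed in this section can be sketched as `#[repr(C)]` structs. This is only an illustration under stated assumptions: the struct and field names are hypothetical, the fields are taken in the order they are listed with no extra padding, and the magic and version fields mentioned under "Backwards Compatibility" are not modeled, so real offsets may differ.

```rust
use std::mem::size_of;

/// Start of the per transaction region, as stated in the proposal.
const TRANSACTION_AREA: u64 = 0x4_0000_0000;

/// A guest slice as serialized: pointer followed by length.
#[repr(C)]
struct GuestSlice {
    ptr: u64,
    len: u64,
}

/// Fixed-size header, fields in the order the proposal lists them (assumed).
#[repr(C)]
struct TransactionHeader {
    last_scratchpad_writer: [u8; 32],
    scratchpad: GuestSlice,
    num_accounts: u64,
}

/// One entry of the transaction account array (assumed field order).
#[repr(C)]
struct TransactionAccount {
    key: [u8; 32],
    owner: [u8; 32],
    lamports: u64,
    payload: GuestSlice,
}

/// Guest address of the metadata of the transaction account at `index`,
/// assuming the account array directly follows the header.
fn transaction_account_addr(index: u64) -> u64 {
    TRANSACTION_AREA
        + size_of::<TransactionHeader>() as u64
        + index * size_of::<TransactionAccount>() as u64
}
```

Under these assumptions the header is 56 bytes and each account entry is 88 bytes, so the entry of any account is found with one multiplication rather than a scan.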

### Per Instruction Serialization

For each instruction the program runtime must prepare the following.
This memory region starts at `0x600000000` and is readonly. It does not require
any updates once serialized.

- The instruction data: `&[u8]` which is composed of:
  - Pointer to instruction data: `u64`
  - Length of instruction data: `u64`
- Program account index in transaction: `u16`
- Number of instruction accounts: `u16`
- For each instruction account:
  - Index to transaction account: `u16`
  - Flags bitfield: `u16` (bit 0 is signer, bit 1 is writable)
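
A minimal sketch of decoding one instruction account entry and its flags bitfield. The bit positions come from the list above; the type name, the method names and the little-endian byte order are assumptions made for illustration.

```rust
/// Bit positions from the proposal: bit 0 is signer, bit 1 is writable.
const FLAG_SIGNER: u16 = 1 << 0;
const FLAG_WRITABLE: u16 = 1 << 1;

/// One per instruction account entry: two `u16` fields, 4 bytes in total.
#[repr(C)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct InstructionAccount {
    transaction_index: u16,
    flags: u16,
}

impl InstructionAccount {
    /// Decodes one 4 byte entry, assuming little-endian encoding.
    fn from_le_bytes(bytes: [u8; 4]) -> Self {
        Self {
            transaction_index: u16::from_le_bytes([bytes[0], bytes[1]]),
            flags: u16::from_le_bytes([bytes[2], bytes[3]]),
        }
    }

    fn is_signer(&self) -> bool {
        self.flags & FLAG_SIGNER != 0
    }

    fn is_writable(&self) -> bool {
        self.flags & FLAG_WRITABLE != 0
    }
}
```

The entry bytes `[0x05, 0x00, 0x02, 0x00]` would then describe transaction account 5 as writable but not a signer.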

### Per Instruction Mappings

A readonly memory region starting at `0x700000000` must be mapped
in for the instruction data. It too does not require any updates.

For each unique (meaning deduplicated) instruction account the payload must
be mapped in at `0x800000000` plus `0x100000000` times the index of the
Contributor

@LucasSte LucasSte May 28, 2025

The instruction data, the scratchpad, and the payload for each account reside in well-known addresses, but we are still serializing them in the instruction and the transaction area. How about we provide the transaction area address 0x400000000 and the instruction area address 0x600000000 in registers so that they would be arguments for the new entrypoint function in the SDK?

This setting would be more consistent with the approach not to hardcode the other addresses in the program SDK.

**transaction** account (not the index of the instruction account). Only if the
Comment on lines +131 to +133

@alnoki alnoki Feb 26, 2026

@Lichtso (cc @febo @igor56D @jacobcreech @deanmlittle @arihantbansal)

per mtnDAO discussion 2026-02-26

TL;DR - non-deterministic account data addressing on a per-instruction basis is akin to a hard drive that starts up with randomized data pointers on every boot

E.g. if account data for account at index x in instruction y is non-deterministic, then direct pointer addressing in data structures breaks, introducing significant overhead for ABI v2.

For example the below binary search tree, which works with absolute addressing in v1, breaks in v2 (assuming non-deterministic instruction account payload addressing) and then has to use much more expensive offset calculations:

Implementation

```rust
#[repr(C, packed)]
/// Tree account data header. Contains pointer to tree root and top of free node stack.
pub struct TreeHeader {
    /// Absolute pointer to tree root in memory map.
    pub root: *mut TreeNode,
    /// Absolute pointer to stack top in memory map.
    pub top: *mut StackNode,
    /// Absolute pointer to where the next node should be allocated in memory map.
    pub next: *mut TreeNode,
}

#[array_fields]
#[repr(C, packed)]
pub struct TreeNode {
    /// Absolute pointer to parent node in memory map.
    pub parent: *mut TreeNode,
    /// Absolute pointers to child nodes in memory map.
    pub child: [*mut TreeNode; tree::N_CHILDREN],
    pub key: u16,
    pub value: u16,
    pub color: Color,
}

#[repr(C, packed)]
/// Nodes removed from tree are pushed onto stack.
pub struct StackNode {
    pub next: *mut StackNode,
}
```

Suggested updates per discussion with @febo 2026-02-27:

  1. Densely packed account headers in read-only region at 0x500000000, laid out via instruction account index, without pointer to account region (40 bytes per instruction account)
  2. Account payloads deterministically translated on a per-instruction basis using instruction index, at 0x800000000 plus 0x100000000 times the index of the instruction account, containing:
    1. Owner
    2. Lamports (40 bytes up until this field, per instruction account)
    3. Data
  3. Account payloads are either read-only or writable depending on writable status of account in instruction

Contributor

About the data structure you just mentioned, does it span across multiple accounts' payloads, or is it supposed to stay contained within a single account?

@LucasSte it is within a single account, but the layout requires a deterministic ordering of accounts at the instruction level. E.g. [user_account, tree_account] for direct pointer addressing. This constraint is met by ABI v1

However in ABI v2 as currently written, instruction accounts are not deterministically laid out. E.g. if txn accounts are [tree_account, user_account], the layout is broken during a CPI to the program in question

@febo is also well aware of the problem and can probably help explain further on internal Anza channels

Contributor Author

In ABIv0/v1 the pointers to instruction accounts (beyond the first) are not stable either because the caller can pick aliasing accounts which shifts everything by the one byte alias marker. Yes, you would probably abort in that case. Just saying we are not guaranteeing address stability.

@febo I don't see any mentions of instruction account aliasing in the suggestion, so I imagine you haven't tackled that problem yet.

> Densely packed account headers in read-only region at 0x500000000 ...
>
> deterministically translated on a per-instruction basis ...

This would bring us back to per instruction serialization, which we want to avoid entirely. The cost / complexity doesn't vanish if it is moved to the program runtime, then we still have to charge for it. Conceptually there are two paths:

  1. Do the maximal / worst-case thing in the program runtime and charge everybody for all and everything, even things they don't want / use.
  2. Do only what you need in the program, be charged for what you use. This is IBRL because less compute is wasted, the price reflects the actual resource usage closer and more transactions can be packed in the same time.

Contributor Author

For the metadata of instruction accounts that (gathering them from transaction accounts) would be the same cost to do inside programs as it would be for the program runtime. For the payload it is a different story because the program can't remap that efficiently from the inside.

> the caller can pick aliasing accounts

@Lichtso are you referring to the non-dup field from acct serialization? Yes it's easy enough to ensure addressable offsets in ABI v1 by just requiring NON_DUP_MARKER

And as far as the instruction serialization schema, I don't know if it is strictly necessary to re-serialize everything; the existing #### Instruction area section should already work fine except for the fat pointers to the non-deterministic payload area: in this case, what about simply shifting around the ### Accounts area addressing so that account payloads are in same order as instruction area?

This could be a simple offset applied to every store/load for the instruction, not dissimilar from the translation already required for the VM, and it then saves a pointer in the InstructionAccount: fewer pointer loads as a result, and a deterministic layout

Contributor

One of the core tenets behind ABIv2 was to reduce fees to a minimum. One way to achieve this is to share the same data structures between the validator and programs without any need to re-organize them. The validator loads accounts for the transaction and uses the account index in transaction for most operations, hence the idea to unify accounts around such an index.

Converting from index in instruction to index in transaction has a cost to the validator too. (1)

> Densely packed account headers in read-only region at 0x500000000, laid out via instruction account index, without pointer to account region (40 bytes per instruction account)

Doing this requires sorting the accounts in a dense area for each top level instruction and twice for each CPI. (2)

> what about simply shifting around the ### Accounts area addressing so that account payloads are in same order as instruction area?

This idea means that we cannot maintain the address space constant throughout the transaction. Consequently, we need to re-create it for each top-level instruction and twice for each CPI. (3)

Doing either (1), (2), (3), or any mix between them entails higher base costs and a possible cost per account in both top level and CPI instructions. That may offset any gains you might have from a predictable address space.

The question we should be discussing is whether it is worth adopting a suggestion that helps your use case, and potentially someone else's, while raising CU costs for everybody.

> And as far as the instruction serialization schema, I don't know if it is strictly necessary to re-serialize everything; the existing #### Instruction area section should already work fine except for the fat pointers to the non-deterministic payload area: in this case, what about simply shifting around the ### Accounts area addressing so that account payloads are in same order as instruction area?

Another point where this might cause confusion is that the index of the account in the transaction would be used for accessing the account metadata and passed on to CPI, but accessing the account payload would have to use the index of the account in the instruction. I believe a unified index is more straightforward.

It is also worth pointing out that the layout in this proposal obviates the syscall GetProcessedSiblingInstruction, since all instructions are provided, together with all the accounts metadata. Reordering accounts by index in instruction would need an effort to rethink this idea.

Contributor

One idea that we discussed was that everything can stay as it is, but there is an extra mapping per instruction to access account payloads. Instead of having to calculate the address of the payload using the account index, there would be instruction specific addresses.

A simplistic view for this would be to map the payload of accounts to a new 0x990000... region (or any other that is available) and space them out by 10MiB. This way the payloads for instruction accounts are in a deterministic address based on their instruction index. Note that this does not mean to copy the content, just creating a mapping for a VM address that takes you directly to each account payload. Would this be feasible? And if yes, is it costly?

Contributor Author

@Lichtso Lichtso Mar 11, 2026

> Would this be feasible?

Assuming you mean having two mappings for each account: One in order of the transaction, one in order of the instruction. Yes, it is easy to do in the program runtime, but it causes a different issue inside the program:

In Rust one can only track multiple aliasing references to the same address, but there is no concept of having multiple aliasing memory mappings (views) of the same underlying memory at different addresses. This would thus break borrow checking and pointer provenance rules if a program ever uses both. A way to circumvent this is by having a cfg feature which selects and only exposes one of the two in a SDK.

Edit: Thinking about it some more it wouldn't even work in Rust with the cfg feature, because the instruction account ordering is aliasing in itself. The instruction to transaction mapping does not just reorder but also deduplicate the mappings.

> And if yes, is it costly?

After SIMD-0339 it is possible to pass in all transaction accounts in an instruction. Meaning, in the worst case, this would double the program runtime work of adjusting the memory mappings for each instruction.

> > Would this be feasible?
>
> Assuming you mean having two mappings for each account: One in order of the transaction, one in order of the instruction. Yes, it is easy to do in the program runtime, but it causes a different issue inside the program:
>
> In Rust one can only track multiple aliasing references to the same address, but there is no concept of having multiple aliasing memory mappings (views) of the same underlying memory at different addresses. This would thus break borrow checking and pointer provenance rules if a program ever uses both

For high-perf programs that rely only on pointers, though, this wouldn't be an issue, and as high-perf methods become more dominant, predictable addressing ensures that foundational data structures work as expected without excessive pointer arithmetic

I think this secondary mapping is a useful idea, especially if it can be optimized to only do payloads for example

instruction account has the writable flag set and is owned by the current
program it is mapped in as a writable region. The writability of a region must
be updated as programs throughout the transaction modify the account metadata.
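
The payload mapping rule above can be sketched as follows. The base address, the stride and the writability condition are taken from the text; the function names are hypothetical.

```rust
/// Base address of the account payload mappings, from the proposal.
const ACCOUNT_PAYLOAD_BASE: u64 = 0x8_0000_0000;
/// Distance between consecutive payload regions: 4 GiB.
const REGION_STRIDE: u64 = 0x1_0000_0000;

/// Guest address where the payload of an instruction account is mapped.
/// Note that the argument is the index of the *transaction* account.
fn payload_region_addr(transaction_index: u16) -> u64 {
    ACCOUNT_PAYLOAD_BASE + REGION_STRIDE * u64::from(transaction_index)
}

/// A payload is mapped writable only if the instruction account has the
/// writable flag set and the account is owned by the current program.
fn payload_is_writable(
    writable_flag: bool,
    owner: &[u8; 32],
    current_program: &[u8; 32],
) -> bool {
    writable_flag && owner == current_program
}
```

For instance, the payload of transaction account 3 would be mapped at `0xB00000000`, independent of where that account appears in the instruction.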

### Lazy deserialization on the dApp side (inside the SDK)

With this design a program SDK can (but no longer needs to) eagerly deserialize
all account metadata at the entrypoint. Because this layout is strictly aligned
and uses proper arrays, it is possible to directly calculate the offset of a
single account's metadata with only one indirect lookup and no need to scan all
preceding metadata. This allows a program SDK to offer a lazy interface which
only interacts with the account metadata fields which are needed, only of the
accounts which are of interest and only when necessary.
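
A sketch of such a lazy lookup, operating on plain byte slices that stand in for the instruction account array and the transaction account array. The entry sizes and the lamports offset are assumptions derived from the field lists (no padding, little-endian), and `lazy_lamports` is a hypothetical name, not an SDK function.

```rust
use std::convert::TryInto;

/// Assumed entry sizes, derived from the listed fields with no padding.
const INSTRUCTION_ACCOUNT_SIZE: usize = 4; // u16 index + u16 flags
const TRANSACTION_ACCOUNT_SIZE: usize = 88; // key + owner + lamports + payload slice
const LAMPORTS_OFFSET: usize = 64; // after key (32 bytes) and owner (32 bytes)

/// Reads only the lamports of one instruction account, touching nothing else.
fn lazy_lamports(
    instruction_accounts: &[u8],
    transaction_accounts: &[u8],
    instruction_account_index: usize,
) -> Option<u64> {
    // The single indirect lookup: instruction account -> transaction index.
    let entry = instruction_account_index.checked_mul(INSTRUCTION_ACCOUNT_SIZE)?;
    let index_bytes = instruction_accounts.get(entry..entry.checked_add(2)?)?;
    let transaction_index = u16::from_le_bytes(index_bytes.try_into().ok()?) as usize;

    // Direct offset calculation; no scan over preceding metadata.
    let start = transaction_index * TRANSACTION_ACCOUNT_SIZE + LAMPORTS_OFFSET;
    let lamports_bytes = transaction_accounts.get(start..start + 8)?;
    Some(u64::from_le_bytes(lamports_bytes.try_into().ok()?))
}
```

Out-of-range indices return `None` here; how a real SDK would report such errors is left open.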

### Changes to syscalls

The `AccountInfo` parameter of the CPI syscalls (`sol_invoke_signed_c` and
`sol_invoke_signed_rust`) will be ignored if ABI v2 is in use. Instead the
changes to account metadata will be communicated explicitly through separate
syscalls `sol_set_account_owner`, `sol_set_account_lamports` and
Contributor

Perhaps we need to mention the expected cost of sol_set_account_lamports to update lamports of an account – this is a quite common operation in programs.

`sol_set_account_length`. Each of these must take a guest pointer to the
structure of the transaction account (see per transaction serialization) to be
updated as the first parameter and the new value as the second parameter. When
the new value is a pubkey, a guest pointer to a 32 byte slice is taken instead.

### Changes to CU metering

CPI will no longer charge CUs for the length of account payloads. Instead TBD
CUs will be charged for every instruction account.

## Impact

This change is expected to drastically reduce the CU costs as the cost will no
longer depend on the length of the instruction account payloads or instruction
data.

From the dApp developer's perspective almost all changes are hidden inside the SDK.
Contributor

I am not sure we can hide the need for sol_set_account_lamports in the SDK. It might also be the case that CUs to update lamports will increase.


## Security Considerations

What security implications/considerations come with implementing this feature?
Are there any implementation-specific guidance or pitfalls?

## Drawbacks

This will require parallel code paths for serialization, deserialization, CPI
call edges and CPI return edges. All of these will coexist with the existing
ABI v0 and v1 for the foreseeable future, until we decide to deprecate them.

## Backwards Compatibility

The magic field (`u32`) and version field (`u32`) of ABI v2 are placed at the
beginning, where ABI v0 and v1 would otherwise indicate the number of
instruction accounts as a `u64`. Because the older ABIs will never serialize
more than a few hundred accounts, it is possible to differentiate the ABI
that way without breaking the older layouts.
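
A minimal sketch of how a program might tell the layouts apart from the first eight bytes. The proposal does not state the magic value or an exact account limit, so `MAX_LEGACY_ACCOUNTS`, the `Abi` enum and the little-endian field order (magic first, then version) are assumptions for illustration only.

```rust
#[derive(Debug, PartialEq, Eq)]
enum Abi {
    /// ABI v0 / v1: the first `u64` is the number of instruction accounts.
    Legacy { num_instruction_accounts: u64 },
    /// ABI v2: the same eight bytes hold a `u32` magic and a `u32` version.
    V2 { magic: u32, version: u32 },
}

/// Hypothetical upper bound; the text only says "a few hundred accounts".
const MAX_LEGACY_ACCOUNTS: u64 = 1024;

fn detect_abi(first_eight_bytes: [u8; 8]) -> Abi {
    let as_u64 = u64::from_le_bytes(first_eight_bytes);
    if as_u64 <= MAX_LEGACY_ACCOUNTS {
        // A small value can only be a legacy account count.
        Abi::Legacy { num_instruction_accounts: as_u64 }
    } else {
        // Otherwise split the value into its two 32 bit halves.
        Abi::V2 {
            magic: (as_u64 & 0xFFFF_FFFF) as u32,
            version: (as_u64 >> 32) as u32,
        }
    }
}
```

This only works if the magic and version are chosen so that their combined `u64` value can never look like a plausible account count, which the sketch assumes.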