SIMD-0177: Program Runtime ABI v2 #177
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,318 @@ | ||
| --- | ||
| simd: '0177' | ||
| title: Program Runtime ABI v2 | ||
| authors: | ||
| - Alexander Meißner | ||
| - Lucas Stuernagel | ||
| category: Standard | ||
| type: Core | ||
| status: Idea | ||
| created: 2025-02-23 | ||
| feature: TBD | ||
| extends: SIMD-0219 | ||
| --- | ||
|
|
||
| ## Summary | ||
|
|
||
| Align the layout of the virtual address space to large pages in order to | ||
| simplify the address translation logic and allow for easy direct mapping. | ||
|
|
||
| ## Motivation | ||
|
|
||
| Direct mapping of the account payload data is enabled by SIMD-0219. | ||
| However, significant optimization potential remains for both programs and the | ||
| program runtime: | ||
|
|
||
| - Instruction data could be mapped directly as well | ||
| - Return data could be mapped directly too | ||
| - Account payload could be resized freely (no more 10 KiB growth limit) | ||
| - CPI could become cheaper in terms of CU consumption | ||
| - Most structures could be shared between programs and program runtime, | ||
| requiring only a single serialization at the beginning of a transaction and | ||
| only small adjustments after | ||
| - Per-instruction serialization before a program runs could be removed entirely | ||
| - Per-instruction deserialization after a program runs could be removed too | ||
| - Deserialization inside the dApp could be reduced to a minimum | ||
| - Programs would only have to pay for what they use, not having to deserialize | ||
| all instruction accounts which were passed in | ||
| - Scanning sibling instructions would not require a syscall | ||
| - Memory regions (and thus address translation) which SIMD-0219 made unaligned | ||
| could be aligned (to 4 GiB) again | ||
|
|
||
| All of these, however, necessitate a major change in how the program runtime | ||
| and programs interface (the ABI). | ||
|
|
||
| ## Alternatives Considered | ||
|
|
||
| None. | ||
|
|
||
| ## New Terminology | ||
|
|
||
| None. | ||
|
|
||
| ## Detailed Design | ||
|
|
||
| Programs signal that they expect ABIv2 through their SBPF version field being | ||
| v4 or above. | ||
|
|
||
| ### Memory Regions | ||
|
|
||
| #### Transaction metadata area | ||
|
|
||
| At the beginning of a transaction the program runtime must prepare a | ||
| readonly memory region starting at `0x400000000`. This region is shared by all | ||
| instructions running programs that support the new ABI. It must be updated | ||
| as instructions throughout the transaction modify the CPI scratchpad or the | ||
| return data. The contents of this memory region are the following: | ||
|
|
||
| - Key of the program which wrote to the return-data scratchpad most | ||
| recently: `[u8; 32]` | ||
| - The return-data scratchpad: `&[u8]`, which is composed of: | ||
| - Pointer to return-data scratchpad: `u64` | ||
| - Length of return-data scratchpad: `u64` | ||
| - The CPI scratchpad: `&[u8]`, which consists of: | ||
| - Pointer to CPI scratchpad: `u64` | ||
| - Length of CPI scratchpad: `u64` | ||
| - Index of the currently executing instruction: `u32` | ||
| - Total number of instructions in transaction (including CPIs and top level | ||
| instructions): `u32` | ||
| - Number of CPIs in trace (under execution and finished): `u32` | ||
| - The number of transaction accounts: `u32` | ||
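As a rough sketch, the fields listed above could be mirrored on the program side by a `#[repr(C)]` struct. This assumes the fields are laid out densely in the listed order with natural alignment, which the text does not state explicitly. The names `VmSlice`, `TransactionMetadata` and `TRANSACTION_METADATA_BASE` are hypothetical.

```rust
/// Hypothetical in-VM representation of `&[u8]`:
/// a 64-bit address followed by a 64-bit length.
#[repr(C)]
#[derive(Clone, Copy, Debug)]
pub struct VmSlice {
    pub ptr: u64,
    pub len: u64,
}

/// Start of the transaction metadata area, as given in the text.
pub const TRANSACTION_METADATA_BASE: u64 = 0x4_0000_0000;

/// Assumed layout: fields in the order of the list, no extra padding.
#[repr(C)]
#[derive(Clone, Copy, Debug)]
pub struct TransactionMetadata {
    /// Key of the program which most recently wrote return data.
    pub return_data_program_key: [u8; 32],
    pub return_data: VmSlice,
    pub cpi_scratchpad: VmSlice,
    pub current_instruction_index: u32,
    /// Includes CPIs and top-level instructions.
    pub instruction_count: u32,
    pub cpi_count: u32,
    pub transaction_account_count: u32,
}
```

Under these assumptions the structure occupies 80 bytes and is 8-byte aligned.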
|
|
||
|
|
||
| #### Account metadata area | ||
|
|
||
| This region starts at `0x500000000`, is readonly and holds the metadata for | ||
| all accounts in the transaction. It is shared by all instructions running | ||
| programs with support for ABIv2, and must be updated as instructions modify the | ||
| metadata with the provided syscalls (see the `Changes to syscalls` section). | ||
| The contents of this region are as follows: | ||
|
|
||
| - For each transaction account: | ||
| - Key: `[u8; 32]` | ||
| - Owner: `[u8; 32]` | ||
| - Lamports: `u64` | ||
|
Comment on lines +91 to +94
Contributor
After a couple of discussions with @Lichtso, we thought the feedback from developer relations would be important here. Today programs can only see the accounts passed to them in the instruction being executed. This layout change entails that programs (and every program invoked from them via CPI) will now be able to access metadata from all the accounts passed in the transaction, regardless of whether they were passed in the instruction or not. We still intend to keep the account payload hidden, though, if it is not an instruction account. Would this change have any unintended consequences on the developer side? (cc. @joncinque and @jacobcreech ) |
||
| - Account payload: `&[u8]` which is composed of: | ||
| - Pointer to account payload: `u64` | ||
| - Account payload length: `u64` | ||
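The per-account record above can be sketched as a `#[repr(C)]` struct. The sketch assumes the records form a dense array starting at the region base, with fields in the listed order and natural alignment; neither is stated explicitly in the text. `AccountMetadata` and `account_metadata_address` are hypothetical names.

```rust
/// Start of the account metadata area, as given in the text.
pub const ACCOUNT_METADATA_BASE: u64 = 0x5_0000_0000;

/// Assumed layout of one transaction account's metadata record.
#[repr(C)]
#[derive(Clone, Copy, Debug)]
pub struct AccountMetadata {
    pub key: [u8; 32],
    pub owner: [u8; 32],
    pub lamports: u64,
    /// Address of the account payload inside the VM.
    pub payload_ptr: u64,
    /// Current length of the account payload in bytes.
    pub payload_len: u64,
}

/// Address of the metadata record for a transaction account,
/// assuming the records are packed back to back.
pub fn account_metadata_address(transaction_account_index: u16) -> u64 {
    let record_size = std::mem::size_of::<AccountMetadata>() as u64;
    ACCOUNT_METADATA_BASE + record_size * u64::from(transaction_account_index)
}
```

With this layout each record is 88 bytes, so looking up an account by its transaction index is a single multiply and add.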
|
Comment on lines +92 to +97
Contributor
Programs also have access to the booleans writable, signer and executable. Are we serializing these ones as well?
Contributor
Author
These are per instruction not per transaction. See the "flags bitfield" in "Per Instruction Serialization".
Comment on lines +96 to +97
Contributor
If we are not mapping the payload for all accounts in the transaction, could we move the payload pointer and length to the instruction area? It would be a little inconvenient to offer developers a slice that causes an access violation when they try to access it, in the case of an account without its payload mapped to the virtual machine.
Contributor
Author
That would then require keeping the payload length field in sync in CPI call and return. That is what I would like to avoid. |
||
|
|
||
| #### Instruction area | ||
|
|
||
| For each transaction, the program runtime must also prepare two memory regions. | ||
| The first one is a readonly region starting at `0x600000000`. It must be | ||
| updated at each CPI call edge. The contents of this region are the following: | ||
|
|
||
| - For each instruction in transaction: | ||
LucasSte marked this conversation as resolved.
Fixed in 0aee740 |
||
| - Reserved field for alignment and potential future usage: `u16` | ||
| - Index in transaction of program account to be executed: `u16` | ||
| - CPI nesting level: `u16` | ||
| - Index of parent instruction (`u16::MAX` for top-level instructions): `u16` | ||
| - Reference to a slice of instruction accounts `&[InstructionAccount]`, | ||
| consisting of: | ||
| - Pointer to slice: `u64` | ||
| - Number of elements in slice: `u64` | ||
| - Instruction data `&[u8]`, which is composed of: | ||
| - Pointer to data: `u64` | ||
| - Length of data: `u64` | ||
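A minimal sketch of one per-instruction record, assuming the fields are stored in the listed order with natural alignment (the four `u16` fields then fill exactly one 8-byte word ahead of the two slices). `InstructionMetadata` and `VmSlice` are hypothetical names, not part of the proposal.

```rust
/// Start of the instruction area, as given in the text.
pub const INSTRUCTION_AREA_BASE: u64 = 0x6_0000_0000;

/// Hypothetical in-VM representation of a slice: address plus element count.
#[repr(C)]
#[derive(Clone, Copy, Debug)]
pub struct VmSlice {
    pub ptr: u64,
    pub len: u64,
}

/// Assumed layout of one instruction's record in the instruction area.
#[repr(C)]
#[derive(Clone, Copy, Debug)]
pub struct InstructionMetadata {
    /// Reserved for alignment and potential future usage.
    pub reserved: u16,
    /// Index in transaction of the program account to be executed.
    pub program_account_index: u16,
    pub cpi_nesting_level: u16,
    pub parent_instruction_index: u16,
    /// Refers to a slice of `InstructionAccount` entries.
    pub instruction_accounts: VmSlice,
    /// Refers to the instruction data bytes.
    pub instruction_data: VmSlice,
}
```

Under these assumptions each record is 40 bytes with no padding.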
|
Contributor
Author
It might be useful for programs to also have the instruction account deduplication map available to them here.
Contributor
I initially devised the deduplication map (i.e. a map from index_in_transaction to index_in_instruction) to be runtime only, since that was the only place we cared about account deduplication. Today the SDK adds duplicate accounts to the AccountInfo array, so I'm not sure programs distinguish them. Maybe @febo can have a broader opinion.
Contributor
If I understood correctly, the current behaviour will still exist. More specifically, when it says: This slice will contain all instruction accounts – e.g., if my instruction expects 2 accounts and both are set to be the same, the program will still receive a slice of length If that is the case, then access to the deduplication map is not necessary since the slice of accounts would indirectly represent this map, right?
Contributor
Author
Yes it is true that the deduplication map is kinda redundant, however it takes time to calculate and we memorize it anyway, so we might as well give it to programs too.
Contributor
Agree with this, if it is something that could be available for "free" to programs, we should include it. There might be a use case where you want to pin-point whether an account is duplicated or not without having to do multiple comparisons. |
||
|
|
||
| Let `InstructionAccount` contain the following fields: | ||
|
|
||
| - Index to transaction account: `u16` | ||
| - Signer flag: `u8` (1 for signer, 0 for non-signer) | ||
| - Writable flag: `u8` (1 for writable, 0 for readonly) | ||
|
|
||
| #### Return data scratchpad | ||
|
|
||
| A writable memory region starting at `0x700000000` must be mapped in for the | ||
| return-data scratchpad. | ||
|
|
||
| ### Accounts area | ||
|
|
||
| For each unique (meaning deduplicated) instruction account the payload must | ||
| be mapped in at `0x800000000` plus `0x100000000` times the index of the | ||
| **transaction** account (not the index of the instruction account). Only if the | ||
|
Comment on lines +131 to +133
@Lichtso (cc @febo @igor56D @jacobcreech @deanmlittle @arihantbansal) per mtnDAO discussion 2026-02-26
TL;DR - non-deterministic account data addressing on a per-instruction basis is akin to a hard drive that starts up with randomized data pointers on every boot
E.g. if account data for account at index x in instruction y is non-deterministic, then direct pointer addressing in data structures breaks, introducing significant overhead for ABI v2. For example the below binary search tree, which works with absolute addressing in v1, breaks in v2 (assuming non-deterministic instruction account payload addressing) and then has to use much more expensive offset calculations:
Implementation
    /// Tree account data header. Contains pointer to tree root and top of free node stack.
    #[repr(C, packed)]
    pub struct TreeHeader {
        /// Absolute pointer to tree root in memory map.
        pub root: *mut TreeNode,
        /// Absolute pointer to stack top in memory map.
        pub top: *mut StackNode,
        /// Absolute pointer to where the next node should be allocated in memory map.
        pub next: *mut TreeNode,
    }
    #[array_fields]
    #[repr(C, packed)]
    pub struct TreeNode {
        /// Absolute pointer to parent node in memory map.
        pub parent: *mut TreeNode,
        /// Absolute pointers to child nodes in memory map.
        pub child: [*mut TreeNode; tree::N_CHILDREN],
        pub key: u16,
        pub value: u16,
        pub color: Color,
    }
    /// Nodes removed from tree are pushed onto stack.
    #[repr(C, packed)]
    pub struct StackNode {
        pub next: *mut StackNode,
    }
Suggested updates per discussion with @febo 2026-02-27:
Contributor
About the data structure you just mentioned, does it span across multiple accounts' payloads, or is it supposed to stay contained within a single account?
@LucasSte it is within a single account, but the layout requires a deterministic ordering of accounts at the instruction level. E.g. However in ABI v2 as currently written, instruction accounts are not deterministically laid out. E.g. if txn accounts are @febo is also well aware of the problem and can probably help explain further on internal Anza channels
Contributor
Author
In ABIv0/v1 the pointers to instruction accounts (beyond the first) are not stable either because the caller can pick aliasing accounts which shifts everything by the one byte alias marker. Yes, you would probably abort in that case. Just saying we are not guaranteeing address stability. @febo I don't see any mentions of instruction account aliasing in the suggestion, so I imagine you haven't tackled that problem yet.
This would bring us back to per instruction serialization, which we want to avoid entirely. The cost / complexity doesn't vanish if it is moved to the program runtime, then we still have to charge for it. Conceptually there are two paths:
Contributor
Author
For the metadata of instruction accounts that (gathering them from transaction accounts) would be the same cost to do inside programs as it would be for the program runtime. For the payload it is a different story because the program can't remap that efficiently from the inside.
@Lichtso are you referring to the non-dup field from acct serialization? Yes it's easy enough to ensure addressable offsets in ABI v1 by just requiring And as far as the instruction serialization schema, I don't know if it is strictly necessary to re-serialize everything; the existing This could be a simple offset applied to every store/load for the instruction, not dissimilar from translation already required for VM, and then saves a pointer in the InstructionAccount: fewer pointer loads as a result, and deterministic layout
Contributor
One of the core tenets behind ABIv2 was to reduce fees to a minimum. One way to achieve this is to share the same data structures between the validator and programs without any need to re-organize them. The validator loads accounts for the transaction and uses the account index in transaction for most operations, hence the idea to unify accounts around such an index. Converting from index in instruction to index in transaction has a cost to the validator too. (1)
Doing this requires sorting the accounts in a dense area for each top level instruction and twice for each CPI. (2)
This idea means that we cannot maintain the address space constant throughout the transaction. Consequently, we need to re-create it for each top-level instruction and twice for each CPI. (3)
Doing either (1), (2), (3), or any mix between them entails higher base costs and a possible cost per account in both top level and CPI instructions. That may offset any gains you might have from a predictable address space. The question we should be discussing is whether it is worth adopting a suggestion that helps your use case, and potentially someone else's, while raising CU costs for everybody.
Another point where this might bring confusion is the fact that the index of account in transaction would be used for accessing the account metadata and passed on to CPI, but the access of the account payload would have to use the index of the account in the instruction. I believe a unified index is more straightforward. It is worth pointing out too that the layout in this proposal obviates the syscall
Contributor
One idea that we discussed was that everything can stay as it is, but there is an extra mapping per instruction to access account payloads. Instead of having to calculate the address of the payload using the account index, there would be instruction specific addresses. A simplistic view for this would be to map the payload of accounts to a new
Contributor
Author
Assuming you mean having two mappings for each account: One in order of the transaction, one in order of the instruction. Yes, it is easy to do in the program runtime, but it causes a different issue inside the program: In Rust one can only track multiple aliasing references to the same address, but there is no concept of having multiple aliasing memory mappings (views) of the same underlying memory at different addresses. This would thus break borrow checking and pointer provenance rules if a program ever uses both. A way to circumvent this is by having a cfg feature which selects and only exposes one of the two in a SDK. Edit: Thinking about it some more it wouldn't even work in Rust with the cfg feature, because the instruction account ordering is aliasing in itself. The instruction to transaction mapping does not just reorder but also deduplicate the mappings.
After SIMD-0339 it is possible to pass in all transaction accounts in an instruction. Meaning, in the worst case, this would double the program runtime work of adjusting the memory mappings for each instruction.
For high-perf programs that rely only on pointers, though, this wouldn't be an issue, and as high-perf methods become more dominant, predictable addressing ensures that foundational data structures work as expected without excessive pointer arithmetic
I think this secondary mapping is a useful idea, especially if it can be optimized to only do payloads for example |
||
| instruction account has the writable flag set and is owned by the current | ||
| program it is mapped in as a writable region. The writability of a region must | ||
| be updated as programs throughout the transaction modify the account metadata. | ||
|
|
||
| The runtime must only map the payload for accounts that belong to the currently | ||
| executing instruction. The payload for accounts belonging to sibling instructions | ||
| must NOT be mapped. | ||
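The address rule for account payloads can be written out as below. Because every region sits on a 4 GiB boundary, the upper 32 bits of a virtual address identify the region and the lower 32 bits are the offset inside it. The constant and function names are hypothetical.

```rust
/// Base of the accounts area, as given in the text.
pub const ACCOUNTS_AREA_BASE: u64 = 0x8_0000_0000;
/// Distance between consecutive regions (4 GiB), as given in the text.
pub const REGION_STRIDE: u64 = 0x1_0000_0000;

/// Virtual address at which the payload of a transaction account is mapped.
/// Note that the index is the *transaction* account index.
pub fn account_payload_address(transaction_account_index: u16) -> u64 {
    ACCOUNTS_AREA_BASE + REGION_STRIDE * u64::from(transaction_account_index)
}

/// Splits a virtual address into (region number, offset within the region).
pub fn split_address(vm_address: u64) -> (u64, u64) {
    (vm_address >> 32, vm_address & 0xFFFF_FFFF)
}
```

For example, transaction account 3 is mapped at `0xB00000000`, and an access at `0xB00000010` resolves to region `0xB` with offset `0x10`.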
|
Contributor
Author
It might be easier to always map in all accounts which are not referenced in an instruction as readonly. That way we wouldn't even have to hide / reveal them on every instruction, thus it is less work for the validator and more available data for the programs. Also, we already load all sysvar accounts, might as well expose them here too. That would however either raise the maximum transaction account number beyond 255 or require a new range of transaction accounts, but that is harder to pull off because of possible aliasing with sysvars which were mentioned in the message. |
||
|
|
||
| ### Instruction payload area | ||
|
|
||
| For each instruction, the runtime must map its payload at address | ||
| `0x10800000000` plus `0x100000000` times the index of the instruction in the | ||
| transaction. All instruction payload mappings are readonly. | ||
|
|
||
| One extra writable mapping must be created after the last instruction payload | ||
| area to serve as the CPI scratchpad, i.e. at address `0x10800000000` plus | ||
| `0x100000000` times the number of instructions in the transaction. Its purpose | ||
| is for programs to write CPI instruction data directly to it and avoid copies. | ||
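A small sketch of the two address computations described in this section, with hypothetical names. The CPI scratchpad simply occupies the slot right after the last instruction payload.

```rust
/// Base of the instruction payload area, as given in the text.
pub const INSTRUCTION_PAYLOAD_BASE: u64 = 0x108_0000_0000;
/// Distance between consecutive regions (4 GiB), as given in the text.
pub const REGION_STRIDE: u64 = 0x1_0000_0000;

/// Address of the readonly payload mapping of one instruction.
pub fn instruction_payload_address(instruction_index: u32) -> u64 {
    INSTRUCTION_PAYLOAD_BASE + REGION_STRIDE * u64::from(instruction_index)
}

/// Address of the writable CPI scratchpad: the slot following the payload
/// of the last instruction in the transaction.
pub fn cpi_scratchpad_address(instruction_count: u32) -> u64 {
    instruction_payload_address(instruction_count)
}
```

With five instructions in the transaction, the payload of instruction 2 is at `0x10A00000000` and the CPI scratchpad at `0x10D00000000`.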
|
|
||
| ### Instruction accounts area | ||
|
|
||
| For each instruction, the runtime must map an array of `InstructionAccount` | ||
| (as previously defined) at address `0x14800000000` plus `0x100000000` times | ||
| the index of the instruction in the transaction. This mapped area is readonly. | ||
|
|
||
| Each of these memory regions contains the following for each instruction: | ||
|
|
||
| - For each account in instruction: | ||
| - `InstructionAccount`, consisting of: | ||
| - Index to transaction account: `u16` | ||
| - Signer flag: `u8` (1 for signer, 0 for non-signer) | ||
| - Writable flag: `u8` (1 for writable, 0 for readonly) | ||
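The `InstructionAccount` entry and the location of each instruction's array might be sketched as follows. The `#[repr(C)]` layout and the helper name are assumptions. The field order follows the list above.

```rust
/// Base of the instruction accounts area, as given in the text.
pub const INSTRUCTION_ACCOUNTS_BASE: u64 = 0x148_0000_0000;
/// Distance between consecutive regions (4 GiB), as given in the text.
pub const REGION_STRIDE: u64 = 0x1_0000_0000;

/// One entry of the per-instruction account array (assumed 4 bytes, no padding).
#[repr(C)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct InstructionAccount {
    /// Index of the account in the transaction.
    pub transaction_account_index: u16,
    /// 1 for signer, 0 for non-signer.
    pub is_signer: u8,
    /// 1 for writable, 0 for readonly.
    pub is_writable: u8,
}

/// Address of the `InstructionAccount` array of one instruction.
pub fn instruction_accounts_address(instruction_index: u32) -> u64 {
    INSTRUCTION_ACCOUNTS_BASE + REGION_STRIDE * u64::from(instruction_index)
}
```

Under these assumptions an entry is 4 bytes, so the n-th account of an instruction sits at the array address plus `4 * n`.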
|
|
||
| ### Sysvar accounts area | ||
|
|
||
| For each existing (non deprecated) sysvar account, the runtime must map its | ||
| payload at address `0x18800000000` plus `0x100000000` times the index of the | ||
| sysvar in the following order: | ||
|
|
||
| 0. Clock | ||
| 1. Epoch rewards | ||
| 2. Epoch schedule | ||
| 3. Last restart slot | ||
| 4. Rent | ||
| 5. Slot hashes | ||
| 6. Stake history | ||
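The sysvar ordering above maps naturally onto an enum whose discriminants are the region indices. The enum and function names here are hypothetical.

```rust
/// Base of the sysvar accounts area, as given in the text.
pub const SYSVAR_AREA_BASE: u64 = 0x188_0000_0000;
/// Distance between consecutive regions (4 GiB), as given in the text.
pub const REGION_STRIDE: u64 = 0x1_0000_0000;

/// Sysvars in the order listed above. The discriminant is the region index.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Sysvar {
    Clock = 0,
    EpochRewards = 1,
    EpochSchedule = 2,
    LastRestartSlot = 3,
    Rent = 4,
    SlotHashes = 5,
    StakeHistory = 6,
}

/// Address at which the payload of a sysvar account is mapped.
pub fn sysvar_payload_address(sysvar: Sysvar) -> u64 {
    SYSVAR_AREA_BASE + REGION_STRIDE * (sysvar as u64)
}
```

For instance, the rent sysvar (index 4) would be readable at `0x18C00000000`.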
|
|
||
| ### VM initialization | ||
|
|
||
| During the initialization of the virtual machine, the runtime must load the | ||
| following values to registers: | ||
|
|
||
| 1. Register R1: A pointer to the metadata of the instruction under execution | ||
| (see section [Instruction area](#instruction-area)). | ||
| 2. Register R2: A pointer to the instruction accounts slice for the | ||
| instruction under execution (see section | ||
| [Instruction accounts area](#instruction-accounts-area)). | ||
| 3. Register R3: The number of instruction accounts for the instruction under | ||
| execution. | ||
|
|
||
|
|
||
| ### Changes to syscalls | ||
|
|
||
| Changes to the account metadata must now be communicated with specific | ||
| syscalls, as detailed below: | ||
|
|
||
| - `sol_assign_owner`: Dst account, new owner as `&[u8; 32]` | ||
|
This syscall makes the owner field volatile and prevents references to it from a program writing perspective. Still working on a way around this, but may require changing the signature to take in a pointer to the owner field to make it non-volatile (CPI still has issues here though).
Contributor
That is a relevant point. I think we can change it to be a pointer yes. Regarding
Contributor
Also, maybe @Buzzec is referring to the fact that transaction accounts have an owner field as Maybe we can have the
We could wrap a raw pointer to the owner in a type with interior mutability, so programs could still use references to owner. The above methods then would be added to this type.
I was thinking through this and I think the only solution due to the volatile memory is to make a comparison syscall as @febo described. Otherwise there will always have to be a copy of the owner field. The other method would be to change the API to be non-volatile on the owner field and owner updates must be obtained from another syscall.
Contributor
Actually, we have a syscall for that: The problem is that for it to be cost effective, the CU costs of memcmp would need to be adjusted, otherwise manually loading and comparing would be cheaper.
I'm personally leaning towards making it a non-volatile field and having a
@febo cc @igor56D @jacobcreech per 2026-02-25 mtnDAO discussion: To maintain CU parity with ABI v1,
r3 := lamports to transfer # calculated separately
# load, increment, store lamports for recipient
ldxdw r2, [r1 + ACCT_TO_INCREMENT_LAMPORTS_OFF]
add64 r2, r3
stxdw [r1 + ACCT_TO_INCREMENT_LAMPORTS_OFF], r2
# repeat for decrement to sender, 3 more CUs |
||
| - `sol_transfer_lamports`: Dst account, src account, amount as `u64` | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Wonder if having "transfer" on the name here could create confusion with "transfer" in the system program – most likely not, but we could consider naming this as
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It does the same thing as "Transfer" of the system program. But there is no risk of accidentally using the wrong one, because this one only works for accounts owned by the current program and the system program one only works for accounts owned by the system program.
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I was just thinking in a different name to emphasize that one is "lower" level than the other – e.g., the system program could implement its "Transfer" instruction using the |
||
|
|
||
| The account parameters are the index of the account in the transaction. | ||
|
|
||
| Changes to the account payload length and all the scratchpads sections | ||
| introduced in this SIMD (the return-data scratchpad and the CPI scratchpad) | ||
| must be communicated via a new syscall `set_buffer_length`, with the following
LucasSte marked this conversation as resolved.
|
||
| parameters: | ||
|
|
||
| - Base address of region to be resized: `u64` | ||
| - New length of region: `u64` | ||
|
|
||
| The syscall must check if the address matches the base address of either a | ||
| writable account payload mapping or one of the scratchpad mappings and return | ||
| an error otherwise. Constraints for the maximum resizable limits must also be | ||
| verified for each region separately. | ||
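The checks described above could look roughly like this on the runtime side. This is a sketch under assumptions: `ResizableRegion`, `ResizeError` and the per-region `max_len` are hypothetical, and the proposal leaves the actual limits open. The `regions` slice is assumed to contain only writable account payload mappings and the two scratchpads.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum ResizeError {
    /// The address is not the base of a resizable region.
    NotResizable,
    /// The requested length exceeds the limit of that region.
    LengthExceedsLimit,
}

/// Hypothetical bookkeeping entry for a region that may be resized.
pub struct ResizableRegion {
    pub base: u64,
    pub len: u64,
    /// Assumed per-region limit. The concrete values are not specified.
    pub max_len: u64,
}

/// Validates and applies a resize request.
pub fn set_buffer_length(
    regions: &mut [ResizableRegion],
    base_address: u64,
    new_len: u64,
) -> Result<(), ResizeError> {
    // The address must match a region base exactly, not merely fall inside it.
    let region = regions
        .iter_mut()
        .find(|r| r.base == base_address)
        .ok_or(ResizeError::NotResizable)?;
    if new_len > region.max_len {
        return Err(ResizeError::LengthExceedsLimit);
    }
    region.len = new_len;
    Ok(())
}
```

Keeping the limit on each region entry lets account payloads and scratchpads be checked against different maximums with the same code path.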
|
|
||
| The `set_buffer_length` syscall must charge a base cost (to be determined) plus | ||
| the same CU per byte ratio as the `memset` syscall. | ||
|
|
||
| The verifier must reject SBPFv4 programs containing the `sol_invoke_signed_c` | ||
| and `sol_invoke_signed_rust` syscalls, since they are not compatible with ABIv2. A new | ||
| syscall `sol_invoke_signed_v2` must replace them. The parameters for | ||
| `sol_invoke_signed_v2` are the following: | ||
|
|
||
| - Index in transaction of program ID to be called: `u64`. | ||
|
Contributor
Could this be a reference to the program id, as it is now? Otherwise it seems that a program will need to lookup the index for the callee program. Or is this index easily accessible?
Contributor
Programs will have access to an array containing all accounts in the transaction. They will be aware of the index of the accounts the instruction is referring to. Having said that, the index is easily accessible and easier to manage. |
||
| - A pointer to a slice `&[InstructionAccount]`, with each element | ||
| `InstructionAccount` | ||
| containing, as previously mentioned: | ||
| - Index to transaction account: `u16` | ||
|
Contributor
Same here, could this be a reference to the account address to avoid the index lookup, as it is now? Or is this index easily accessible?
Contributor
Author
Indices are cheaper to compare and avoid a key search in the program runtime during CPI. Thus we want to avoid using
Contributor
Yeah, totally agree with that. Just checking that we are not pushing this "cost" to the program. It seems that we are going to be ok, since the index is available on the
Contributor
Yes, you can only CPI a program if you pass on its account in the transaction. |
||
| - Signer flag: `u8` (1 for signer, 0 for non-signer) | ||
| - Writable flag: `u8` (1 for writable, 0 for readonly) | ||
| - The length of the `&[InstructionAccount]` slice. | ||
| - A pointer to the signer seeds of type `&[&[&[u8]]]`. | ||
| - The length of the outer signer seeds slice in `&[&[&[u8]]]`. | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @LucasSte If this will work as the current
Contributor
I was trying to maintain compatibility with
Contributor
Hmmm, I see. The issue is that Rust slices don't have a stable layout so not sure if this can end up being a problem later. Maybe we should go with how
Would be nice if we could remove these fields altogether with whole transaction/instruction signing for PDAs |
||
|
|
||
| Programs using `sol_get_return_data` and `sol_set_return_data` must be | ||
| rejected by the verifier if ABI v2 is in use. | ||
|
|
||
| ### Scratchpad management | ||
|
|
||
| This SIMD introduces two scratchpad regions: the return-data scratchpad and | ||
| the CPI scratchpad. At the beginning of every instruction, these scratchpads | ||
| must be empty and their size must be zero. | ||
|
|
||
| Programs must set the desired length for them using the `set_buffer_length` | ||
| syscall. Reads and writes to a region beyond the scratchpad length must | ||
| trigger an access violation error. | ||
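The length rule above can be modelled with a small bounds-checked buffer. This is a sketch under assumptions, not the runtime's implementation: `Scratchpad`, `VmError` and `set_length` are hypothetical names, `set_length` only stands in for the effect of the `set_buffer_length` syscall, and zero-filling newly exposed bytes is an assumption of the sketch.

```rust
/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq, Eq)]
pub enum VmError {
    AccessViolation,
}

/// Hypothetical model of one scratchpad region (return-data or CPI).
#[derive(Default)]
pub struct Scratchpad {
    data: Vec<u8>,
}

impl Scratchpad {
    /// A scratchpad starts every instruction empty, with length zero.
    pub fn new() -> Self {
        Self::default()
    }

    /// Current length of the scratchpad in bytes.
    pub fn len(&self) -> usize {
        self.data.len()
    }

    /// Stand-in for the effect of `set_buffer_length`.
    /// Zero-filling on growth is an assumption of this sketch.
    pub fn set_length(&mut self, new_len: usize) {
        self.data.resize(new_len, 0);
    }

    /// Reads `len` bytes at `offset`; fails if the range leaves the scratchpad.
    pub fn read(&self, offset: usize, len: usize) -> Result<&[u8], VmError> {
        let end = offset.checked_add(len).ok_or(VmError::AccessViolation)?;
        self.data.get(offset..end).ok_or(VmError::AccessViolation)
    }

    /// Writes `bytes` at `offset`; fails if the range leaves the scratchpad.
    pub fn write(&mut self, offset: usize, bytes: &[u8]) -> Result<(), VmError> {
        let end = offset
            .checked_add(bytes.len())
            .ok_or(VmError::AccessViolation)?;
        let dst = self
            .data
            .get_mut(offset..end)
            .ok_or(VmError::AccessViolation)?;
        dst.copy_from_slice(bytes);
        Ok(())
    }
}
```

In this model a write to a fresh scratchpad fails until the program has set a non-zero length, which mirrors the requirement that both regions start empty.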
|
|
||
| The management of writable account payloads must work similarly, except | ||
| that they must not be initialized empty, but instead with the pre-existing | ||
| data they hold. | ||
|
|
||
| ### CPIs | ||
|
|
||
| With ABI v2 and the new `sol_invoke_signed_v2` syscall, CPIs must be managed | ||
| differently. At each CPI call, the runtime must perform the following actions: | ||
|
|
||
| 1. Verify that all account indexes received in the `InstructionAccount` array | ||
| belong to the currently executing instruction. Likewise, the program ID index | ||
| that should be called must also undergo the same verification. | ||
| 2. Append the slice `&[InstructionAccount]` passed as a parameter to the | ||
| array kept at address `0x700000000`. | ||
| 3. Append a new instruction at the end of the serialization array kept at | ||
| `0x600000000`. | ||
| 4. Transform the caller CPI scratchpad into a readonly instruction payload | ||
| region visible for the callee. | ||
| 5. Change the visibility and write permissions for the account payload | ||
| regions, according to the CPI accounts and their flags. | ||
| 6. Update the address for the callee CPI scratchpad, the index of the currently | ||
| executing instruction, and the number of instructions in the transaction at | ||
| address `0x400000000`. | ||
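Steps 1 and 2 of the list above can be sketched as follows. The function and error names are hypothetical, the `InstructionAccount` field names are illustrative, and a `Vec` stands in for the array kept at `0x700000000`. The sketch covers index membership only and models none of the memory-mapping steps.

```rust
/// Illustrative element layout; field names are hypothetical.
#[repr(C)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct InstructionAccount {
    pub index_in_transaction: u16,
    pub is_signer: u8,
    pub is_writable: u8,
}

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq, Eq)]
pub enum CpiError {
    AccountNotInCaller(u16),
    ProgramNotInCaller(u16),
}

/// Step 1: every callee account index, and the callee program ID index,
/// must already be referenced by the currently executing instruction.
pub fn verify_cpi_indices(
    caller: &[InstructionAccount],
    callee: &[InstructionAccount],
    program_index: u16,
) -> Result<(), CpiError> {
    let in_caller = |idx: u16| caller.iter().any(|a| a.index_in_transaction == idx);
    if !in_caller(program_index) {
        return Err(CpiError::ProgramNotInCaller(program_index));
    }
    for account in callee {
        if !in_caller(account.index_in_transaction) {
            return Err(CpiError::AccountNotInCaller(account.index_in_transaction));
        }
    }
    Ok(())
}

/// Step 2: append the callee slice to the shared array and return the
/// element range that the new instruction refers to.
pub fn append_instruction_accounts(
    all: &mut Vec<InstructionAccount>,
    callee: &[InstructionAccount],
) -> std::ops::Range<usize> {
    let start = all.len();
    all.extend_from_slice(callee);
    start..all.len()
}
```

Because the check compares `u16` indices rather than account addresses, it needs no key search, which is the motivation given in the review thread above.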
|
|
||
| When the CPI returns, the runtime must do the following: | ||
|
|
||
| 1. Update the address for the CPI scratchpad, and keep the previously used one | ||
| at the existing address assigned during the CPI call. The new CPI scratchpad | ||
| address is the same as the previous one plus `0x100000000`. | ||
| 2. Change the read and write permission for the account payload regions, | ||
| according to potential changes in account ownership. | ||
| 3. Update the index of the currently executing instruction. | ||
| 4. No changes must be made at addresses `0x600000000` and `0x700000000`. | ||
|
|
||
| ### Changes to CU metering | ||
|
|
||
| CPI will no longer charge CUs for the length of account payloads. Instead, TBD | ||
| CUs will be charged for every instruction account. Also, TBD CUs will be charged | ||
| for the three new account metadata updating syscalls. TBD CUs will be charged for | ||
| resizing a scratchpad. | ||
|
|
||
| ### Lazy deserialization on the dApp side (inside the SDK) | ||
|
|
||
| With this design a program SDK can (but no longer needs to) eagerly deserialize | ||
| all account metadata at the entrypoint. Because this layout is strictly aligned | ||
| and uses proper arrays, it is possible to directly calculate the offset of a | ||
| single account's metadata with only one indirect lookup and no need to scan all | ||
| preceding metadata. This allows a program SDK to offer a lazy interface which | ||
| only interacts with the account metadata fields which are needed, only of the | ||
| accounts which are of interest and only when necessary. | ||
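A lazy accessor of this kind might look like the sketch below. All names are hypothetical, and `AccountMetadata` is a placeholder record with made-up fields rather than the layout defined by this proposal. Plain slices stand in for the arrays the runtime would map into the address space.

```rust
/// Illustrative element layout; field names are hypothetical.
#[repr(C)]
#[derive(Clone, Copy, Debug)]
pub struct InstructionAccount {
    pub index_in_transaction: u16,
    pub is_signer: u8,
    pub is_writable: u8,
}

/// Placeholder metadata record; the fields are made up for this sketch.
#[repr(C)]
#[derive(Debug, PartialEq, Eq)]
pub struct AccountMetadata {
    pub lamports: u64,
    pub payload_length: u64,
}

/// Hypothetical lazy view: nothing is deserialized until it is asked for.
pub struct LazyAccounts<'a> {
    instruction_accounts: &'a [InstructionAccount],
    transaction_accounts: &'a [AccountMetadata],
}

impl<'a> LazyAccounts<'a> {
    pub fn new(
        instruction_accounts: &'a [InstructionAccount],
        transaction_accounts: &'a [AccountMetadata],
    ) -> Self {
        Self { instruction_accounts, transaction_accounts }
    }

    /// One indirect lookup: instruction slot -> transaction index -> metadata.
    /// No preceding entries are scanned.
    pub fn metadata(&self, slot: usize) -> Option<&'a AccountMetadata> {
        let account = self.instruction_accounts.get(slot)?;
        self.transaction_accounts
            .get(usize::from(account.index_in_transaction))
    }
}

/// Byte offset of entry `index` in a densely packed metadata array.
pub fn metadata_offset(index: u16) -> usize {
    usize::from(index) * std::mem::size_of::<AccountMetadata>()
}
```

The offset is a single multiplication because every entry has the same size, which is what a strictly aligned array layout provides.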
|
|
||
| ## Impact | ||
|
|
||
| This change is expected to drastically reduce the CU costs as the cost will no | ||
| longer depend on the length of the instruction account payloads or instruction | ||
| data. | ||
|
|
||
| From the dApp developer's perspective, almost all changes are hidden inside the SDK. | ||
|
Contributor
I am not sure we can hide the need for |
||
|
|
||
| ## Security Considerations | ||
|
|
||
| What security implications/considerations come with implementing this feature? | ||
| Are there any implementation-specific guidance or pitfalls? | ||
|
|
||
| ## Drawbacks | ||
|
|
||
| This will require parallel code paths for serialization, deserialization, CPI | ||
| call edges and CPI return edges. All of these will coexist with the existing | ||
| ABI v0 and v1 for the foreseeable future, until we decide to deprecate them. | ||
I was thinking about this member. Would it be helpful if it were instead a
VmSlice<InstructionFrame>?