@nwf nwf commented Jun 11, 2025

This implements David's proposal at
#100 (comment)

The RTOS test suite passes, but further review is probably wise.

This is intended less for immediate merge or propagation to existing CHERIoT targets than to reduce our divergence, where we can, from the RISC-V Y draft.

@nwf nwf requested review from davidchisnall, rmn30 and vmurali June 11, 2025 16:50
@nwf nwf added the maybe-v2 Tracking issues for possible changes for an ISAv2 label Jun 11, 2025
vmurali commented Jun 21, 2025

Background on Backward Sentries

Sentries were originally introduced to specify the interrupt disposition of the callee: enabling, disabling, or inheriting. Backward sentries were added to close an availability attack in which a malicious caller targets an interrupt-disabling library function. The caller sets the return address to the library function's own entry point (which is an interrupt-disabling sentry), so every return re-enters the function, and the non-malicious library ends up being run repeatedly without interrupts ever being re-enabled.
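The attack can be sketched with a toy model. Everything here (`Sentry`, `jump`, `replay_attack`) is a hypothetical illustration rather than CHERIoT's actual semantics: it tracks only a boolean for whether interrupts are enabled, plus a flag for whether the hardware distinguishes forward sentries from backward ones.

```python
from dataclasses import dataclass


class SentryFault(Exception):
    """Raised when a sentry is used in the wrong direction."""


@dataclass(frozen=True)
class Sentry:
    direction: str    # "forward" (call target) or "backward" (return address)
    disposition: str  # "enable", "disable", or "inherit"


def jump(sentry, expected_direction, interrupts_enabled, check_direction):
    """Jump through a sentry and return the resulting interrupt state."""
    if check_direction and sentry.direction != expected_direction:
        raise SentryFault(f"{sentry.direction} sentry used as {expected_direction}")
    if sentry.disposition == "inherit":
        return interrupts_enabled
    return sentry.disposition == "enable"


def replay_attack(check_direction, returns=3):
    """Trace the interrupt state while a caller replays the library entry."""
    library_entry = Sentry("forward", "disable")
    trace = []
    # The malicious caller calls the library and passes library_entry as the
    # return address instead of a genuine return sentry.
    enabled = jump(library_entry, "forward", True, check_direction)
    trace.append(enabled)
    for _ in range(returns):
        try:
            # The library "returns", which re-enters it with interrupts off.
            enabled = jump(library_entry, "backward", enabled, check_direction)
        except SentryFault:
            trace.append("fault")
            break
        trace.append(enabled)
    return trace
```

With `check_direction=False` every entry in the trace is `False`, i.e. interrupts never come back on. With the check enabled, the first forged return faults instead.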

(An aside: given that sentries are possible entry points, a backward sentry is especially critical. It is created dynamically and yields an arbitrary entry point that is not specified in the export tables. So, if a malicious compartment gets hold of a backward sentry, it can keep attacking the target of the sentry, forcing it to handle a return when it is not expecting one. That is one of the reasons backward sentries were initially forced to be Local, so that they could not be stashed in global memory indefinitely. That proved cumbersome when the stack is in the heap, because it prevents any spill of backward sentries.)


Background on Tail Calls

A function X that makes a tail call does not create a new call frame. Instead, it passes along the existing return address to its own caller, so that X's callee returns directly to X's caller, skipping X. The interrupt disposition of X's caller should be restored on that final return. The callee's interrupt status can differ from that of X, which in turn can differ from that of X's caller.
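As a rough illustration, the sequence "A calls X, X tail-calls Y, Y returns" can be traced in a few lines. The names (`BackwardSentry`, `enter`, `run_tail_call`) and the boolean interrupt state are assumptions made for this sketch only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackwardSentry:
    return_to: str
    interrupts_enabled: bool  # state to restore on return


def enter(disposition, interrupts_enabled):
    """Interrupt state after entering a function with the given disposition."""
    if disposition == "inherit":
        return interrupts_enabled
    return disposition == "enable"


def run_tail_call(caller_enabled, x_disposition, y_disposition):
    """Return (where Y returns to, state inside Y, state after the return)."""
    # A calls X: the link register captures A's interrupt state.
    cra = BackwardSentry("A", caller_enabled)
    enabled = enter(x_disposition, caller_enabled)
    # X tail-calls Y: no new link is written, so cra still refers to A.
    enabled_in_y = enter(y_disposition, enabled)
    # Y returns through cra: control skips X and A's state is restored.
    return cra.return_to, enabled_in_y, cra.interrupts_enabled
```

For example, `run_tail_call(True, "disable", "inherit")` has Y running with interrupts disabled (inherited from X), yet the return lands in A with interrupts enabled again.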


Background on Outlining

Outlining is the opposite of inlining: you create functions to capture repeated code snippets. In CHERIoT, these outlined functions could be implemented (I'm not sure whether this is done currently) as common libraries shared across multiple compartments. An outlined function necessarily has the same interrupt disposition as its call site, i.e. the caller of the outlined function, since one cannot change the interrupt status in the middle of a call. To avoid register spills, the caller's return address register should not be reused for the outlined function's return. In particular, if cra holds the caller's return address (which is the common case), the caller cannot use cra as the link register (i.e. the destination register) for the outlined function invocation.
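The link-register constraint can be shown with a toy register file. The choice of `ct0` as the alternate link register is an assumption for illustration, and `invoke_outlined` and `needs_spill` are hypothetical helpers.

```python
def invoke_outlined(registers, link_register):
    """Register file as seen inside an outlined function after jump-and-link."""
    inside = dict(registers)
    inside[link_register] = "resume point in caller"
    return inside


def needs_spill(link_register):
    """True if the caller's own return address was overwritten by the link."""
    before = {"cra": "return to caller's caller", "ct0": "scratch"}
    after = invoke_outlined(before, link_register)
    return after["cra"] != before["cra"]
```

Linking through `cra` clobbers the caller's return address, so it would have to be spilled first; linking through another register leaves it intact.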


Ideal solution

Let's say we have different instructions to convey different jump types. We can also discuss if they can all be collapsed into one instruction. Here are the main types:

cjal{r} src:unsealed, dst:any: Intra-compartment, non-library call. Produces an unsealed cap in dst if dst $\neq$ $czero. This can also create a backward sentry, to guard against buggy callees that inadvertently change the return address. That is what we currently do in CHERIoT, and what the RISC-V proposal in #100 (comment) also does. It is in fact the reason for the current proposal (#101): to ensure that a sentry is created in dst rather than leaving it unsealed.

call src:forward any, dst:any: A call, with dst optionally provided for the return. If dst $\neq$ $czero, create a backward sentry in dst with the interrupt disposition of the caller. If dst $=$ $czero, it can be a tail call, where the return address is already set up (most likely in $cra).

ret src:backward any, dst:$czero: A return using a backward sentry. The interrupt disposition is set explicitly by the earlier call (by the caller or one of its ancestors) that created this backward sentry, or by an unsealed cjalr, if we create backward sentries for unsealed cjalr instructions.

This would work because call and ret are disambiguated by the instruction itself. cjalr can share an opcode with either call or ret because cjalr has an unsealed src.

This is the ideal solution but requires a new instruction.
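The three jump types listed in this section can be sketched as operand checks. This is a toy model under stated assumptions: `Cap`, `execute`, and `faults` are hypothetical names, the cjalr case models only the unsealed-link variant, and the interrupt state is a single boolean.

```python
from dataclasses import dataclass


class Fault(Exception):
    """Raised when an operand does not match what the instruction requires."""


@dataclass(frozen=True)
class Cap:
    kind: str                 # "unsealed", "forward", or "backward"
    disposition: str = "n/a"  # "enable", "disable", or "inherit" for sentries


def apply_disposition(disposition, interrupts_enabled):
    if disposition == "inherit":
        return interrupts_enabled
    return disposition == "enable"


def execute(op, src, dst, interrupts_enabled):
    """Return (link written to dst or None, interrupt state after the jump)."""
    if op == "cjalr":
        if src.kind != "unsealed":
            raise Fault("cjalr requires an unsealed src")
        link = None if dst == "czero" else Cap("unsealed")
        return link, interrupts_enabled
    if op == "call":
        if src.kind != "forward":
            raise Fault("call requires a forward sentry")
        link = None
        if dst != "czero":
            # The link records the caller's interrupt state for the return.
            link = Cap("backward", "enable" if interrupts_enabled else "disable")
        return link, apply_disposition(src.disposition, interrupts_enabled)
    if op == "ret":
        if src.kind != "backward" or dst != "czero":
            raise Fault("ret requires a backward sentry and dst = czero")
        return None, apply_disposition(src.disposition, interrupts_enabled)
    raise Fault(f"unknown operation {op!r}")


def faults(op, src, dst, interrupts_enabled=True):
    """Convenience helper: True if the operation is rejected."""
    try:
        execute(op, src, dst, interrupts_enabled)
    except Fault:
        return True
    return False
```

Because the instruction itself says whether a call or a return is intended, a forward sentry handed to `ret` is rejected regardless of which registers are involved.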


Forcing same opcode for different jumps

If we are forced to use the same opcode for call and ret, then we have to somehow disambiguate between a call with dst $=$ $czero (as used in a tail call), which requires a forward sentry, and a ret, which requires a backward sentry. If we conflate forward and backward sentries, we are back to the old problem: a malicious compartment can set the return address of a non-interruptible library function to a forward sentry for that same library function.
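A minimal sketch of the ambiguity, assuming no extra rule keyed on register numbers: with a single opcode and dst = czero, the hardware has to accept both sentry directions, so a forged return looks like a legitimate tail call. Both function names are hypothetical.

```python
def accepted_by_single_opcode(src_kind, dst):
    """One jump opcode: the only hints are the src sentry kind and dst."""
    if dst != "czero":
        # A link is written, so this must be a call.
        return src_kind == "forward"
    # dst == czero is either a tail call (forward) or a return (backward).
    return src_kind in ("forward", "backward")


def accepted_by_split_opcodes(op, src_kind):
    """Separate call and ret opcodes: each demands exactly one sentry kind."""
    required = {"call": "forward", "ret": "backward"}
    return src_kind == required[op]
```

A library "returning" through a forward sentry to its own entry point passes `accepted_by_single_opcode("forward", "czero")` but fails `accepted_by_split_opcodes("ret", "forward")`.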


The proposed solution for forcing same opcode for different jumps (#100)

A backward inheriting sentry is always created for cjal{r} src:unsealed, as well as for call src dst when dst $\notin$ {$czero, $cra} (which necessarily means src is a forward inheriting sentry). So a backward inheriting sentry can be created only if the caller has the same interrupt disposition as the callee. This means it still does not allow a malicious caller to set up repeated calls to a non-interruptible library function.

Such a backward inheriting sentry can also be used for a ret src when src $\neq$ $cra, which allows the same cjalr opcode to be used for both call and ret.
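The link-creation rule as summarised in this comment can be written down as a small decision function. This is a sketch of my reading of the comment, not the text of #100: the kind names are made up, and the dst = cra case for sealed sources is an assumption carried over from the call description in the ideal solution.

```python
class Fault(Exception):
    """Raised when the jump is rejected."""


def link_sentry(src_kind, dst, interrupts_enabled):
    """Kind of capability written to dst by a single-opcode jump (sketch)."""
    if dst == "czero":
        return None  # tail call or return: no link is written
    if src_kind == "unsealed":
        return "backward-inheriting"
    if dst != "cra":
        # A link outside cra is only allowed when the callee inherits the
        # caller's interrupt state, so the return can inherit it back.
        if src_kind != "forward-inheriting":
            raise Fault("dst outside {czero, cra} needs a forward inheriting src")
        return "backward-inheriting"
    # ASSUMPTION: with dst == cra, the link records the caller's interrupt
    # state, as in the description of call in the ideal solution.
    return "backward-enabling" if interrupts_enabled else "backward-disabling"


def rejected(src_kind, dst, interrupts_enabled=True):
    """Convenience helper: True if the jump faults."""
    try:
        link_sentry(src_kind, dst, interrupts_enabled)
    except Fault:
        return True
    return False
```

In this model a caller cannot obtain a backward inheriting sentry from a call into an interrupt-disabling entry point: `rejected("forward-disabling", "ct0")` is true.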


I think this solution works, but it might be better overall to spare another opcode for disambiguating this, instead of relying on this complex logic and analysis. It would make the overall exception pipeline simpler and would remove all restrictions on the registers used for src and dst. It's actually not even that bad in terms of the opcode space: there's a funct3 field, which is 0 for cjalr. That can be made non-zero for call or ret.
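To make the encoding argument concrete, here is a small sketch using the standard RISC-V I-type layout for JALR (major opcode 0b1100111, funct3 = 0). The non-zero funct3 value used for an explicit return is a made-up placeholder, not an assigned encoding, and whether that space is free in a given target is an assumption.

```python
JALR_OPCODE = 0b1100111  # RISC-V JALR major opcode; funct3 is 0 for jalr/cjalr


def encode_jump(rd, rs1, imm=0, funct3=0):
    """Encode an I-type jump: imm[11:0] | rs1 | funct3 | rd | opcode."""
    return ((imm & 0xFFF) << 20) | (rs1 << 15) | (funct3 << 12) | (rd << 7) | JALR_OPCODE


def funct3_of(word):
    """Extract bits 14:12 of an instruction word."""
    return (word >> 12) & 0b111


ZERO, RA = 0, 1              # register numbers for czero and cra
HYPOTHETICAL_RET_FUNCT3 = 1  # placeholder value chosen for this sketch

plain_jump = encode_jump(rd=ZERO, rs1=RA)
explicit_ret = encode_jump(rd=ZERO, rs1=RA, funct3=HYPOTHETICAL_RET_FUNCT3)
```

The two words differ only in bits 14:12, so a decoder could tell a return from a tail call without inspecting register numbers.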


I think there's actually no need to seal the return address when the forward address is unsealed, because that necessarily denotes the same compartment (buggy code be damned), which makes the whole proposal moot. Instead, sealed addresses should only be used for library or cross-compartment calls and returns. But I still think that, even if the return addresses are not sealed, one should have different instructions to disambiguate calls and returns that use sentries.

@vmurali vmurali left a comment


I added my review in the conversation

vmurali commented Jun 25, 2025

@davidchisnall @nwf, what are your objections to a new "Ret" instruction, other than that RISC-V doesn't have it? Does it lift the arbitrary restrictions on the registers passed to tail calls and outlining calls? Or does it not solve the ambiguity anyway?
