130 changes: 130 additions & 0 deletions docs/src/design/hubris-riscv.md
@@ -0,0 +1,130 @@
# Porting Hubris OS to RISC-V

## Executive Summary

Porting Hubris OS to RISC-V should be **comparatively straightforward** next to porting most operating systems, because Hubris was explicitly designed with architecture portability in mind. The documentation already references RISC-V (although some details, such as the syscall interface, are still marked TBD), and the architecture-specific code is small and well isolated.

## 🏗️ Architecture-Agnostic Design Philosophy

### Minimal Kernel Surface Area
Hubris follows a **microkernel philosophy** where the kernel does as little as possible:

- **Small syscall set**: Only 14 syscalls total
- **Preemptive scheduling**: Simple priority-based scheduler
- **Statically allocated**: All resources determined at compile time
- **Task isolation**: Memory protection via region-based MPU/PMP, not page tables
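
To make the "simple priority-based scheduler" point concrete, here is a minimal host-side sketch of picking the next task. `TaskSlot` and `pick_next` are hypothetical names introduced for illustration, and the rule that a numerically lower priority is more important is an assumption of this sketch, not something stated in this document.

```rust
/// Hypothetical task record; the real kernel's task structure differs.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct TaskSlot {
    /// Assumption for this sketch: numerically lower means more important.
    pub priority: u8,
    pub runnable: bool,
}

/// Returns the index of the most important runnable task, if any.
/// Ties go to the lowest index, because `min_by_key` keeps the first minimum.
pub fn pick_next(tasks: &[TaskSlot]) -> Option<usize> {
    tasks
        .iter()
        .enumerate()
        .filter(|(_, t)| t.runnable)
        .min_by_key(|(_, t)| t.priority)
        .map(|(i, _)| i)
}
```

Nothing in this selection logic depends on the CPU architecture, which is the property the port relies on: only the mechanics of switching to the chosen task are architecture-specific.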

### Clean Architecture Abstraction

All architecture-specific code is isolated in a single module:

```rust
// sys/kern/src/arch.rs - Current structure
cfg_if::cfg_if! {
    if #[cfg(target_arch = "arm")] {
        pub mod arm_m;
        pub use arm_m::*;
    } else {
        compile_error!("support for this architecture not implemented");
    }
}
```

Adding RISC-V support requires only:

```rust
// Proposed addition: a new branch between the existing `arm` and `else` arms
    } else if #[cfg(target_arch = "riscv32")] {
        pub mod riscv;
        pub use riscv::*;
    } else {
```

## What Already Exists for RISC-V

### Documentation References

The Hubris documentation **already references RISC-V** in several places:

#### 1. **Syscall Interface** (syscalls.adoc:74-76)
```
=== RISC-V

Syscalls are invoked using the `ECALL` instruction. The rest is TBD.
```

#### 2. **Timer System** (timers.adoc:6)
```
silicon vendors -- the `SysTick` on ARM, the `mtimer` on RISC-V. Hubris provides
a multiplexer for this timer, so that each task appears to have its own.
```
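
The multiplexing idea in the quoted passage can be modelled on a host in a few lines. This is a sketch under stated assumptions: `TimerMux` and its methods are hypothetical names, it uses `Vec` for brevity (a kernel would use statically sized storage), and it is not the Hubris implementation.

```rust
/// Hypothetical per-task virtual timers multiplexed onto one hardware counter.
pub struct TimerMux {
    /// One optional deadline (in ticks) per task index.
    deadlines: Vec<Option<u64>>,
}

impl TimerMux {
    pub fn new(task_count: usize) -> Self {
        Self { deadlines: vec![None; task_count] }
    }

    /// Arms the timer belonging to `task`, or disarms it when given `None`.
    pub fn set(&mut self, task: usize, deadline: Option<u64>) {
        self.deadlines[task] = deadline;
    }

    /// Called with the current tick count; disarms and returns every task
    /// whose deadline has been reached, in task-index order.
    pub fn expire(&mut self, now: u64) -> Vec<usize> {
        let mut fired = Vec::new();
        for (task, slot) in self.deadlines.iter_mut().enumerate() {
            if matches!(*slot, Some(d) if d <= now) {
                *slot = None;
                fired.push(task);
            }
        }
        fired
    }

    /// Earliest pending deadline: the value a port could program into a
    /// hardware compare register, if the platform timer has one.
    pub fn next_deadline(&self) -> Option<u64> {
        self.deadlines.iter().flatten().copied().min()
    }
}
```

Only the code that reads the counter and programs the compare value would differ between `SysTick` and the RISC-V machine timer; the bookkeeping above stays the same.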

#### 3. **Interrupt Handling** (interrupts.adoc:5-11)
```
Hubris port, but these ideas are intended to translate to RISC-V systems using
controllers like the PLIC.
```

### Architecture Requirements Already Defined

The documentation specifies what any architecture port needs:

1. **32-bit registers**: ✅ RV32 matches
2. **Supervisor call instruction**: ✅ `ECALL` is the counterpart of ARM's `SVC`
3. **Memory protection**: ✅ RISC-V PMP is analogous to the ARM MPU
4. **Standard timer**: ✅ `mtimer` (the `mtime`/`mtimecmp` machine timer) is analogous to ARM `SysTick`
5. **Interrupt controller**: ✅ PLIC is analogous to the ARM NVIC
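
As an illustration of the memory-protection item, the sketch below encodes a region for a RISC-V `pmpaddr` register in NAPOT (naturally aligned power-of-two) mode on RV32. The function name `pmp_napot_addr` is hypothetical, and a real port would also have to set the matching `pmpcfg` permission bits, which this sketch leaves out.

```rust
/// Encodes a naturally aligned power-of-two (NAPOT) region for a RISC-V
/// `pmpaddr` CSR on RV32. Returns `None` if the region cannot be expressed.
pub fn pmp_napot_addr(base: u32, size: u32) -> Option<u32> {
    // NAPOT regions are at least 8 bytes, a power of two, and size-aligned.
    if size < 8 || !size.is_power_of_two() || base % size != 0 {
        return None;
    }
    // `pmpaddr` holds address bits [33:2]; the run of trailing one bits
    // encodes the region size (no trailing ones means 8 bytes).
    Some((base >> 2) | ((size / 2 - 1) >> 2))
}
```

For example, a 4 KiB region at `0x2000_0000` encodes as `0x0800_01FF`: the base shifted right by two, with nine trailing one bits describing the size.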

## 🔧 Implementation Requirements (Minimal)

### Architecture Module (~1900 lines estimated)

Based on the existing ARM implementation (`sys/kern/src/arch/arm_m.rs` - 1901 lines):

| Component | Estimated Lines | Complexity | ARM Equivalent |
|-----------|----------------|------------|----------------|
| **Context switching** | ~300 | Medium | Save/restore `x1-x31` vs `r0-r15` |
| **Syscall entry** | ~200 | Low | `ECALL` handler vs `SVC` handler |
| **Timer integration** | ~100 | Low | `mtimer` vs `SysTick` |
| **Memory protection** | ~200 | Medium | PMP setup vs MPU setup |
| **Interrupt routing** | ~200 | Low | PLIC vs NVIC |
| **Task state management** | ~500 | Medium | TCB save/restore |
| **Boot sequence** | ~100 | Low | Reset handler |
| **Utilities/macros** | ~300 | Low | Architecture helpers |
| **Total** | **~1900** | **Low-Medium** | **Direct translation** |

### Register Mapping (Trivial)

**Current ARM Syscall Convention:**
- Arguments: `r4` through `r10` (7 args)
- Syscall number: `r11`
- Returns: `r4` through `r11` (8 returns)

**Proposed RISC-V Convention:**
- Arguments: `x10-x16` (`a0-a6`) (7 args)
- Syscall number: `x17` (`a7`)
- Returns: `x10-x17` (`a0-a7`) (8 returns)
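
The proposed convention can be sketched as a small accessor type over the saved argument registers. `SavedArgs` and its methods are hypothetical names for illustration; a real port would save the full register file, not just `a0`-`a7`.

```rust
/// Hypothetical view of a task's saved registers: only a0..a7 (x10..x17)
/// are modelled here.
#[derive(Debug, Default, Clone, Copy, PartialEq, Eq)]
pub struct SavedArgs {
    pub a: [u32; 8],
}

impl SavedArgs {
    /// Syscall number, taken from a7 (x17) under the proposed convention.
    pub fn syscall_number(&self) -> u32 {
        self.a[7]
    }

    /// The seven argument registers a0..a6 (x10..x16).
    pub fn args(&self) -> [u32; 7] {
        let mut out = [0u32; 7];
        out.copy_from_slice(&self.a[..7]);
        out
    }

    /// Writes the eight return values back into a0..a7.
    pub fn set_returns(&mut self, rets: [u32; 8]) {
        self.a = rets;
    }
}
```

Keeping the register layout behind accessors like these means the portable syscall dispatch code never names a physical register, so the ARM `r4`-`r11` and RISC-V `a0`-`a7` layouts can differ freely.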

### Core Functions to Implement

```rust
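// Architecture interface sketch, modelled on the ARM module. Signatures are
// illustrative and bodies are omitted; check `arm_m.rs` for the exact exports.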
pub fn apply_memory_protection(task: &Task) -> Result<(), FaultInfo>;
pub fn start_task(task: &Task) -> !;
pub fn save_task_state(task: &mut Task);
pub fn restore_task_state(task: &Task);
pub fn current_task_ptr() -> *const Task;
pub fn set_current_task_ptr(task: *const Task);
pub fn usermode_entry_point() -> u32;
pub fn get_task_dump_area() -> &'static mut [u8];
```

### ✅ What Makes Hubris Easy to Port

1. **🎯 Narrow target scope**: Only 32-bit microcontrollers
2. **📦 Rust ecosystem**: RISC-V already well-supported
3. **🔒 Memory safety**: Rust rules out many memory-safety bugs that commonly appear during ports
4. **⚡ Simple execution model**: Privileged kernel, unprivileged tasks
5. **🛡️ Minimal assembly**: Most code is portable Rust
6. **📚 Clear documentation**: Architecture requirements already specified

30 changes: 16 additions & 14 deletions docs/src/design/os-selection.md
@@ -1,8 +1,8 @@
# OpenPRoT Operating System Selection

Platform root of trust (PRoT) implementations require an operating system that provides hardware-enforced memory isolation, deterministic behavior, and fault recovery without compromising system integrity.

OpenPRoT addresses these requirements as an open-source, Rust-based platform that provides a secure foundation for platform security. The project offers a Hardware Abstraction Layer (HAL) and suite of services for device attestation, secure firmware updates, and modern security protocols (SPDM, MCTP, PLDM) [5].

The OpenPRoT workgroup (hereafter "the workgroup") evaluated best-in-class Rust embedded OSes to identify the optimal operating system for this security-critical embedded platform.

@@ -13,30 +13,30 @@ This whitepaper documents the workgroup's evaluation process and technical ratio
Our evaluation framework assessed:

1. **Memory protection and isolation mechanisms** - Critical for security boundaries
2. **Fault tolerance and recovery capabilities** - Essential for system reliability
3. **Static vs. dynamic system composition** - Impacts predictability and security
4. **System complexity and attack surface** - Affects long-term maintainability and security
5. **Preemptive scheduling and determinism** - Important for responsive system behavior
6. **Debuggability and system observability** - Critical for development, testing, and production monitoring

## Evaluation Criteria Details

**Memory Protection and Isolation Mechanisms**
PRoT requires strict separation between trusted and untrusted components. We evaluated how each OS enforces memory boundaries, prevents unauthorized access between tasks, and isolates drivers from the kernel. Hardware-enforced isolation (Memory Protection Unit - MPU) provides stronger guarantees than software-based partitioning.

**Fault Tolerance and Recovery Capabilities**
Critical infrastructure cannot tolerate cascading failures. We assessed each system's ability to contain faults, restart failed components without affecting others, and maintain system integrity during partial failures. The ability to predict and bound failure modes is essential. Key requirements include in-place component reinitialization capabilities, supervisor-mediated fault recovery, and memory isolation to limit the "blast radius" of failures without requiring system-wide reboots.

**Static vs. Dynamic System Composition**
Runtime flexibility introduces uncertainty in security-critical systems. We compared compile-time system definition (where all components and dependencies are known) against runtime component loading. Static composition enables better security analysis and eliminates entire classes of runtime failures. Key evaluation criteria included compile-time validation capabilities, build-time configuration verification, and the ability to detect resource conflicts and communication path errors before deployment.

**System Complexity and Attack Surface**
PRoT systems have focused requirements that differ from general-purpose embedded applications. We evaluated how each OS architecture aligns with these specific security-critical needs. For platform root of trust implementations, features like dynamic application loading and runtime resource allocation provide valuable capabilities for flexible system deployment, though they introduce considerations around predictability and attack surface analysis.

**Preemptive Scheduling and Determinism**
Platform root of trust implementations require predictable response times for security-critical operations like cryptographic processing and attestation responses. We assessed each system's scheduling guarantees, priority handling, and ability to ensure high-priority security tasks can always preempt lower-priority work within bounded time.

**Debuggability and System Observability**
Complex embedded systems require robust debugging and monitoring capabilities throughout development and deployment. We evaluated each system's approach to runtime inspection, system state visibility, and debugging infrastructure. Traditional console-based debugging introduces security vulnerabilities and code bloat, making kernel-aware debugging tools essential for production systems. The ability to observe system behavior without modifying application code or introducing runtime overhead is critical for security-sensitive platforms.

## Detailed Technical Analysis
@@ -48,6 +48,8 @@ Complex embedded systems require robust debugging and monitoring capabilities th
| **System Composition** | **Static**: All tasks defined at compile-time in app.toml configuration, cannot be created/destroyed at runtime. Build system validates all configurations with static assertions. Supports in-place task reinitialization for fault recovery - supervisor task can restart crashed tasks without system reboot. Design philosophy prioritizes eliminating functionality not essential for server management and platform security, resulting in a smaller, more focused codebase to audit and validate. | **Dynamic**: Tasks can be dynamically loaded and assigned. Offers flexibility for diverse application scenarios and runtime adaptation. | Static model with compile-time validation prevents entire classes of runtime failures. In-place restart capability enables component-level recovery, avoiding system-wide reboots for isolated faults. Dynamic models provide flexibility for applications requiring runtime component loading or updates. |
| **Communication** | **Strictly Synchronous**: IPC blocks sender until reply received. Uses rendezvous mechanism inspired by L4 microkernel - kernel performs direct memory copy between tasks, extending Rust's ownership model across task boundaries through leasing. | **Asynchronous**: Callback-based notifications for applications. | Synchronous communication eliminates race conditions, enables precise fault isolation (REPLY_FAULT at error point), and simplifies kernel design by avoiding complex message queue management. |
| **Fault Isolation** | **Disjoint Protection Domains**: Drivers and kernel in separate, MPU-enforced memory spaces. Failing driver cannot corrupt kernel. | **Shared Protection Domain**: Drivers run in same domain as kernel but are partitioned by Rust's type system and capsule architecture. Capsules are kernel modules that rely on Rust's memory safety (borrowing rules, lifetime management) and trait-based interfaces for isolation rather than hardware memory protection. | Hardware-enforced isolation provides robust defense against faults. Memory-safe languages alone don't prevent all failures in critical systems. |
| **Embedded CPU Architecture Support** | **ARM Cortex-M:** Official native support included.<br> **RISC-V:** Designed with RISC-V in mind, but currently has only unofficial support from outside developers, including OpenPRoT partners. | **ARM Cortex-M:** Official native support included.<br> **RISC-V:** Official native support included.<br> **x86 (32-bit):** Official native support included. | While native support is desirable, Hubris is relatively straightforward to port to additional architectures for these reasons:<br><br> 1. **🎯 Narrow target scope**: Only 32-bit microcontrollers<br> 2. **📦 Rust ecosystem**: RISC-V already well-supported<br> 3. **🔒 Memory safety**: Rust rules out many memory-safety bugs that commonly appear during ports<br> 4. **⚡ Simple execution model**: Privileged kernel, unprivileged tasks<br> 5. **🛡️ Minimal assembly**: Most code is portable Rust<br> 6. **📚 Clear documentation**: Architecture requirements already specified<br><br> [More details](./hubris-riscv.md) |
| **Licensing** | **Mozilla Public License Version 2.0**: Commercial use allowed; may be combined with proprietary code; modified MPL files must be shared and remain MPL; explicit patent grant included; copyright notices must be retained. | **Apache License 2.0**: Commercial use allowed without restrictions; may be combined with proprietary code; significant changes must be stated but need not be shared; explicit patent grant included; copyright notices must be retained. | Both licenses allow commercial use and mixing files with other licenses (including proprietary code). The primary difference is that MPL-licensed files must remain under the MPL, and any changes to those files must be shared publicly. |

### Resource & Memory Management

@@ -70,23 +72,23 @@ Complex embedded systems require robust debugging and monitoring capabilities th

The analysis revealed that Hubris's microkernel architecture with MPU-enforced isolation and static task assignment better aligns with PRoT requirements than Tock's dynamic application model.

**Hubris's "Aggressively Static" Philosophy**
Hubris employs comprehensive compile-time validation through static assertions, moving error detection from runtime to build time [1]. All system configuration is declared in app.toml files, with the build system performing extensive checks on task priorities, resource requirements, and communication paths [2]. This approach makes entire classes of runtime failures impossible by construction - if a configuration would lead to resource exhaustion or invalid task communication, the build simply fails with a clear error message.
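
The general technique of moving a configuration check into the build can be sketched with a Rust `const` assertion. Everything here is hypothetical: `TaskDesc`, `TASKS`, `RAM_BUDGET`, and the task names are illustrative stand-ins for what a build system might generate from app.toml, not the actual Hubris build output.

```rust
/// Hypothetical build-time task description.
pub struct TaskDesc {
    pub name: &'static str,
    pub priority: u8,
    pub ram_bytes: u32,
}

/// Assumed total RAM available to tasks in this example.
pub const RAM_BUDGET: u32 = 64 * 1024;

pub const TASKS: [TaskDesc; 3] = [
    TaskDesc { name: "supervisor", priority: 0, ram_bytes: 4 * 1024 },
    TaskDesc { name: "spdm", priority: 2, ram_bytes: 32 * 1024 },
    TaskDesc { name: "i2c_driver", priority: 1, ram_bytes: 8 * 1024 },
];

/// Sums the RAM requested by every task; usable in const context.
pub const fn total_ram(tasks: &[TaskDesc]) -> u32 {
    let mut sum = 0;
    let mut i = 0;
    while i < tasks.len() {
        sum += tasks[i].ram_bytes;
        i += 1;
    }
    sum
}

// Evaluated by the compiler: an over-budget configuration fails the build.
const _: () = assert!(total_ram(&TASKS) <= RAM_BUDGET, "tasks exceed RAM budget");
```

Raising any `ram_bytes` value so that the total exceeds `RAM_BUDGET` turns the last line into a compile error, so the invalid configuration never reaches a device.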

**Synchronous IPC Design for Robustness**
Hubris implements synchronous, message-based Inter-Process Communication inspired by L4 microkernel design [1]. The rendezvous mechanism operates like cross-task function calls: the sender blocks until the receiver processes the message and replies. This enables direct memory copying between tasks without intermediate queues, extends Rust's ownership model across task boundaries through memory leasing [6], and provides precise fault isolation - a buggy task can be terminated with REPLY_FAULT at the exact error point, preventing fault propagation.
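
The blocking send-then-reply pattern can be approximated on a host with standard-library channels. This is only an analogy under stated assumptions: `Message`, `send`, and `spawn_doubler` are hypothetical names, and threads plus heap-allocated channels stand in for what the text describes as a direct in-kernel copy between tasks.

```rust
use std::sync::mpsc::{channel, sync_channel, Receiver, Sender, SyncSender};
use std::thread;

/// A request paired with the channel on which the server must reply.
pub struct Message {
    pub payload: u32,
    pub reply_to: Sender<u32>,
}

/// Blocking "send": returns only once the server has replied.
pub fn send(server: &SyncSender<Message>, payload: u32) -> u32 {
    let (reply_to, reply_rx) = channel();
    server.send(Message { payload, reply_to }).expect("server is gone");
    reply_rx.recv().expect("server dropped the request")
}

/// Spawns a toy server that doubles each payload and replies.
pub fn spawn_doubler() -> SyncSender<Message> {
    // Capacity 0 makes this a rendezvous channel: `send` blocks until the
    // server is ready to receive, so no message is ever queued.
    let (tx, rx): (SyncSender<Message>, Receiver<Message>) = sync_channel(0);
    thread::spawn(move || {
        for msg in rx {
            let _ = msg.reply_to.send(msg.payload * 2);
        }
    });
    tx
}
```

From the caller's side, `send(&server, 21)` reads like an ordinary function call that returns `42`, which is the property that makes this style of IPC easy to reason about.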

**Component-Level Fault Recovery**
Hubris enables recursive component-level restarts without system reboots through in-place task reinitialization [1]. When a task experiences a kernel-visible fault (memory access violation, panic), the kernel notifies a designated supervisor task, which can restart the failed task by resetting its registers, stack, and resource connections. Memory isolation limits the "blast radius" - corrupt state in one task cannot affect others. This allows individual driver crashes to be handled by restarting just the affected components rather than the entire system, critical for continuous operation in server infrastructure.
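
A minimal model of the supervisor's bookkeeping is sketched below. `TaskState`, `Supervised`, and `recover` are hypothetical names, and the per-task `generation` counter is an assumption of this sketch; the actual register, stack, and resource resets are not modelled.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TaskState {
    Healthy,
    Faulted,
}

/// Hypothetical per-task bookkeeping held by a supervisor.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Supervised {
    pub state: TaskState,
    /// Bumped on every restart so stale references can be told apart.
    pub generation: u32,
}

/// Restarts only the faulted tasks and returns how many were restarted.
pub fn recover(tasks: &mut [Supervised]) -> usize {
    let mut restarted = 0;
    for t in tasks.iter_mut().filter(|t| t.state == TaskState::Faulted) {
        t.state = TaskState::Healthy;
        t.generation = t.generation.wrapping_add(1);
        restarted += 1;
    }
    restarted
}
```

Healthy tasks are never touched by `recover`, mirroring the point that a single driver crash is handled without disturbing the rest of the system.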

**Kernel-Aware Debugging Architecture**
Hubris takes a unique approach to system debugging through its co-developed Humility debugger and Debug Binary Interface (DBI). Rather than implementing traditional console interfaces within applications, Hubris applications contain no printf-level formatting code, command parsing, or console interfaces. Instead, the external Humility debugger provides comprehensive system inspection capabilities through kernel-aware debugging protocols.

This architecture eliminates common security vulnerabilities associated with console interfaces - buffer overflows, format string vulnerabilities, and command injection attacks - while reducing application code size by removing formatting and parsing logic. The DBI allows applications to declare variables and types that the debugger can automatically discover and manipulate, providing superior observability without runtime overhead or security compromise.

Hubris includes comprehensive core dump support, enabling the capture of complete system snapshots into files for post-mortem analysis. These dumps can be loaded into Humility for offline debugging, allowing detailed investigation of system failures without requiring access to the live hardware. This capability proves particularly valuable for security-critical systems where traditional debugging interfaces would introduce unacceptable attack surface, enabling thorough failure analysis while maintaining production system security.

**Critical Architectural Differences**
Key differentiators include Hubris's hardware-enforced memory boundaries, user-space driver architecture, and compile-time system composition versus Tock's software-based isolation for kernel drivers (capsules) [4] and runtime application loading. In Tock, capsules are kernel modules that share the same privilege level and address space as the kernel core, with isolation achieved through Rust's type system, borrow checker, and carefully designed trait boundaries rather than hardware memory protection. Hubris eliminates dynamic memory allocation, task creation/destruction, and runtime resource management [2], while Tock maintains flexibility through grant-based dynamic allocation and runtime component loading [3,4].

These architectural differences have direct implications for security guarantees, system predictability, and fault containment in PRoT applications.