Software Transactional Memory #307

felsweg-iota · 2022-01-26T17:18:32Z


Feature Name: (transactional memory / objects, software transactional memory)
Start Date: (2021-12-06)

Summary

Replace actix with a system based on software transactional memory (STM), keeping concurrent calls and atomic transactions transparent to the user.

This RFC gives insight into

how a software transactional memory system works,
why a software transactional memory system is a suitable replacement for the current actor system in Stronghold,
how a software transactional memory could be implemented to replace parts of the internal architecture of Stronghold and reduce coupling among those parts.

Motivation

Removal of Actix as actor system

Stronghold employs actix as its actor framework to manage concurrent operations inside the underlying system. An actor system is not a bad choice, as it abstracts away difficult synchronization mechanisms. However, actix explicitly takes ownership of the underlying executor framework, which makes it hard to integrate Stronghold in a shared context. Furthermore, actix runs on a single-threaded event loop, which renders actor isolation per thread obsolete.

Advantages over actor systems

In an STM-based system all objects with mutable state are transactional, and operations on these objects transparently use the underlying system. This makes it possible to isolate guarded memory so that it is not exposed at runtime. Transactions are always safe: internal conflicts are rolled back automatically and retried until the transaction succeeds. Operations on mutable memory are composable.
Only write operations actually change the object, so only writes must be communicated to the other threads. Recent work describes multiple approaches; we consider blocking or retrying the other transactions the most viable approach to ensure data consistency. Once a transaction has finished and all of its read operations have been validated, the resulting work is committed to the actual object. Other threads operating on the same object never see the uncommitted changes made to it. The STM described here uses a lazy approach to rolling back a transaction. An STM can also be described as an optimistic locking approach: work on memory is considered safe until a conflict occurs, at which point the transaction can be safely rolled back and retried.
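The optimistic approach described above can be sketched with nothing but the standard library. This is a minimal illustration, not the proposed implementation: `VersionedVar`, `atomically` and `concurrent_increments` are hypothetical names, and a single version counter stands in for full read validation.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical versioned cell: the version is bumped on every commit.
pub struct VersionedVar<T> {
    state: Mutex<(u64, T)>,
}

impl<T: Clone> VersionedVar<T> {
    pub fn new(value: T) -> Self {
        Self { state: Mutex::new((0, value)) }
    }

    /// Returns the current version together with a private copy of the value.
    pub fn snapshot(&self) -> (u64, T) {
        let guard = self.state.lock().unwrap();
        (guard.0, guard.1.clone())
    }

    /// Commits `value` only if nobody has committed since `seen_version`.
    pub fn try_commit(&self, seen_version: u64, value: T) -> bool {
        let mut guard = self.state.lock().unwrap();
        if guard.0 != seen_version {
            return false; // conflict: the caller rolls back and retries
        }
        *guard = (seen_version + 1, value);
        true
    }
}

/// Runs `tx` on a private copy and retries until the commit succeeds.
pub fn atomically<T: Clone>(var: &VersionedVar<T>, tx: impl Fn(T) -> T) {
    loop {
        let (version, copy) = var.snapshot();
        let result = tx(copy); // work happens without holding the lock
        if var.try_commit(version, result) {
            return;
        }
        // Rolling back is free here: the private copy is simply dropped.
    }
}

/// Usage: several threads increment one counter without losing updates.
pub fn concurrent_increments(threads: usize, per_thread: usize) -> usize {
    let var = Arc::new(VersionedVar::new(0usize));
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let var = Arc::clone(&var);
            thread::spawn(move || {
                for _ in 0..per_thread {
                    atomically(&*var, |n| n + 1);
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    var.snapshot().1
}
```

The lock is only held while taking the snapshot and while committing, so the transaction body itself runs unlocked; a stale version at commit time is what triggers the retry.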

A Priori Problems with Composable Transactions

Transactions may only perform revocable operations

use std::io::Write;

fn irrevocable() -> std::io::Result<()> {
    // placeholder path, for illustration only
    let mut file = std::fs::File::create("example.bin")?;
    atomic! {
        // This writes bytes into a file. While this example
        // might be less extreme, any action on that file
        // is inherently irrevocable inside a transaction, unless a buffered
        // copy of the file is stored explicitly to restore its prior state.

        file.write_all(b"")?;
    };
    Ok(())
}

The example above shows an I/O operation that is irrevocable for any transaction log. Thus, a transaction should only make benign memory modifications.
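One common way around irrevocable operations is to record the intended I/O during the transaction and perform it only after a successful commit. The following is a rough sketch under that assumption; `TxContext`, `defer_write` and `run_with_deferred_io` are hypothetical names and not part of any proposed API.

```rust
/// Hypothetical transaction context: I/O intents are recorded, not executed.
#[derive(Default)]
pub struct TxContext {
    /// Bytes the transaction wants to emit once it has committed.
    pending_output: Vec<u8>,
}

impl TxContext {
    /// Records an intent to write; nothing leaves the process here.
    pub fn defer_write(&mut self, bytes: &[u8]) {
        self.pending_output.extend_from_slice(bytes);
    }
}

/// Runs `body` until it reports success, then performs the deferred writes
/// exactly once. `body` returns `false` to signal a conflict.
/// Returns the number of attempts that were needed.
pub fn run_with_deferred_io<W: std::io::Write>(
    sink: &mut W,
    mut body: impl FnMut(&mut TxContext) -> bool,
) -> std::io::Result<usize> {
    let mut attempts = 0;
    loop {
        attempts += 1;
        // A fresh context per attempt: rolling back means dropping it.
        let mut ctx = TxContext::default();
        if body(&mut ctx) {
            sink.write_all(&ctx.pending_output)?;
            return Ok(attempts);
        }
    }
}

/// Usage: the first two attempts conflict, yet the sink sees the payload once.
pub fn demo_deferred_io() -> (usize, Vec<u8>) {
    let mut sink: Vec<u8> = Vec::new();
    let mut conflicts_left = 2;
    let attempts = run_with_deferred_io(&mut sink, |ctx| {
        ctx.defer_write(b"secret");
        if conflicts_left > 0 {
            conflicts_left -= 1;
            return false; // simulated conflict
        }
        true
    })
    .expect("writing to a Vec cannot fail");
    (attempts, sink)
}
```

Because every attempt starts from an empty context, a retried transaction never leaks partial output; only the attempt that commits touches the sink.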

Blocking does not compose

Guide-level explanation

Overview

Software transactional memory (STM) offers composable concurrency constructs by bundling memory operations into atomic transactions. A target object that can potentially be changed by concurrent processes is cloned. Each read or write operation on this object is recorded in a thread-local copy of the object.
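To make the idea of a thread-local copy concrete, here is a small sketch in which writes are buffered in a per-transaction log and only become visible on commit. `SharedVar` and `LocalLog` are hypothetical stand-ins for the transactional variable and its log; the real types are described further below and may differ.

```rust
use std::sync::{Arc, RwLock};

/// Hypothetical shared variable; the committed value sits behind an `Arc`.
#[derive(Clone)]
pub struct SharedVar<T> {
    committed: Arc<RwLock<Arc<T>>>,
}

/// Per-transaction log: remembers what was read and buffers what was written.
pub struct LocalLog<T> {
    read: Arc<T>,
    written: Option<T>,
}

impl<T: Clone> SharedVar<T> {
    pub fn new(value: T) -> Self {
        Self { committed: Arc::new(RwLock::new(Arc::new(value))) }
    }

    /// Starts a transaction by recording the currently committed value.
    pub fn begin(&self) -> LocalLog<T> {
        let guard = self.committed.read().unwrap();
        LocalLog { read: Arc::clone(&*guard), written: None }
    }

    /// Returns a copy of the committed value.
    pub fn load(&self) -> T {
        let guard = self.committed.read().unwrap();
        (**guard).clone()
    }

    /// Publishes the buffered write if the value read at `begin` is still
    /// the committed one; otherwise reports a conflict.
    pub fn commit(&self, log: LocalLog<T>) -> bool {
        let mut slot = self.committed.write().unwrap();
        if !Arc::ptr_eq(&*slot, &log.read) {
            return false;
        }
        if let Some(value) = log.written {
            *slot = Arc::new(value);
        }
        true
    }
}

impl<T: Clone> LocalLog<T> {
    /// Reads see the transaction's own pending write first.
    pub fn read(&self) -> T {
        self.written.clone().unwrap_or_else(|| (*self.read).clone())
    }

    /// Writes only touch the local copy.
    pub fn write(&mut self, value: T) {
        self.written = Some(value);
    }
}

/// Usage: the write is invisible to other readers until it is committed.
pub fn demo_isolation() -> (i32, i32, i32) {
    let var = SharedVar::new(1);
    let mut log = var.begin();
    let next = log.read() + 41;
    log.write(next);
    let seen_inside = log.read();
    let seen_outside = var.load();
    let committed = var.commit(log);
    assert!(committed);
    (seen_inside, seen_outside, var.load())
}
```

Comparing `Arc` pointers is enough for validation in this sketch because the log keeps the old allocation alive, so a newer value can never reuse its address.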

System Description

The following paragraph briefly describes the system that an STM tries to support.

Task system T with tasks: t1, t2, ..., tn
Lock Lx:{y}: Lock x protects a set of resources y
Access to some resource la

Use Cases

The use cases shown below illustrate that writes, reads and the execution of procedures are done atomically within the runtime. Minimal API changes have been introduced to showcase the integration of an STM.

Use cases with explicit atomic transactions

The use cases shown here either use explicit calls to an atomic function that wraps all referenced variables in a transaction, or use an attribute macro to mark a function explicitly as transactional.

Ephemeral key derivation and encryption

use stm::{vars::TVar, atomic};
use crate::{Location, ClientId, Chain};
use crypto::aead::XChaCha20Poly1305;


impl Stronghold {

    pub fn create_ephemeral_keys_and_encrypt(
        &self, 
        id : ClientId,
        data : Vec<u8>,
        location : Location,
        chain: Chain) 
            -> Result<(Vec<u8>, Signature), crate::Error> 
    {   

        // all state changing vars should be listed here and referenced
        // by a transactional variable `TVar` 

        // We obtain a copy of the vault inside a transactional variable
        let t_vault : TVar<Vault> = self.vault_mut(&id).into();

        // TVar keeps an atomic reference to the lock variable, hence
        // TVar itself is cloneable and thread-safe. Each thread gets its
        // own handle, since the handle is moved into the thread.
        let t_vault_1 : TVar<Vault> = t_vault.clone();
        let t_vault_2 : TVar<Vault> = t_vault.clone();

        // conduct an atomic operation inside a thread.
        let thread_a = std::thread::spawn(move || {
            // provide an explicit atomic block for all operations that
            // should run inside a transaction. `self` here can be automatically
            // wrapped into a transactional object that records each operation
            // and commits it if the transaction was uninterrupted, or retries
            // at least once, according to a default strategy
            atomic(|transaction| {
                // Read and validate the correctness of `vault`, return a handle to the vault
                let vault               = t_vault_1.read(&transaction)?;

                // called functions should not have side effects, since transactions are always
                // reversible. `transaction` can be passed a log object using `with`
                let ephemeral_key       = transaction.
                                            with(&vault).
                                            runtime_exec(Slip10Derive::new_from_seed(location, chain));

                // perform the encryption using the procedures API
                transaction.
                    crypto().
                    xchacha20poly1305().
                    encrypt(data, ephemeral_key, None)

            }).and_then(| result | {
                // some post transactional operation

                Ok(result)
            })
        });


        let thread_b = std::thread::spawn(move || {
            // a second, independent transaction on the same vault;
            // it is isolated from the transaction running in `thread_a`
            atomic(|transaction| {
                // Read and validate the correctness of `vault`, return a handle to the vault
                let vault               = t_vault_2.read(&transaction)?;

                // some other operations
                // ...


            }).and_then(| result | {
                // some post transactional operation

                Ok(result)
            });
        });

        // via `read_atomic` we can access the inner value of the transactional variable
        let vault = t_vault.read_atomic().expect("Could not get inner variable");


        // join threads; `thread_a` yields the result of its transaction
        thread_b.join().expect("failed to join thread");
        thread_a.join().expect("failed to join thread")
    }
}

Use case(s) with implicit atomic transactions

The use cases shown below assume some implicit / blanket implementations of transactional traits. Please note that the traits mentioned here are for demonstration purposes only and are meant to give an idea of how a transactional memory system could be implemented. A more formal description of proper types and traits is out of scope for this RFC.

Writing multiple records into the vault and write a snapshot

This code example loads a client from a path. A new task is spawned that references
the client to write some data into the vault, and at the end the current state is persisted
inside a snapshot. Even though writing to the vault is seemingly executed concurrently,
the execution itself is atomic: the vault state is written BEFORE it is written into the
snapshot. No data races occur, and clients can be modified concurrently in a safe way.

/// Shows the usage of a transactional memory system in an asynchronous context
#[tokio::main]
pub async fn main() -> Result<(), TransactionalError> {

    let client_a = Client::load(b"some-client-path");
    let key = b"...";

    // handle for the spawned task; this assumes `Client` is cheaply
    // cloneable and shares its transactional state
    let client_task = client_a.clone();

    // spawn a new task to write into the vault
    // atomically
    tokio::spawn(async move {

        let location = Location::from("");
        let payload  = b"image/png:data=base64;2jdhadhk";
        let hint     = vec![0xde, 0xad, 0xbe, 0xef];

        // write into a vault within a transaction
        // (`id` is the client id, assumed to be in scope)
        client_task.vault_mut(&id).
            insert(
                &location,
                &payload,
                &hint
        ).await?;

        Ok::<(), TransactionalError>(())
    });


    // write a snapshot, while moving the key
    // the state previously written into the vault is atomic
    // and will definitely be seen inside the snapshot
    client_a.snapshot().
        update("/path/to/snapshot/file").
        with_key(key).await?;

    Ok(())
}

Reference-level explanation

In order to make transactions atomic, we need to ensure that each transaction has a thread-local copy of the target variable. One way to do that is to keep a log holding the value. The object itself is referenced by a TVar, or transactional variable, that can be shared across threads (or, in the case of an asynchronous runtime, tasks). Each TVar keeps track of its own thread and all other threads that potentially want to update the variable via VarAccessControl. Reading from a variable might involve validating that no other thread has written to the same variable. If one has, the transaction is retried (this is the standard case; a counter can be incorporated here to mitigate possibly infinite retries). The whole transaction is represented by a Transaction object that manages the execution and finally commits the calculated value to memory.

Another approach would be a blanket implementation for all types that satisfy a Transactional trait. With this approach, sensible defaults can be inferred from the types used to ensure transactional memory.

/// Represents a transactional variable. Keeps the actual value, hidden from
/// other threads / asynchronous tasks.
/// `Clone` is implemented manually below, so it is not derived here.
pub struct TVar<T> where T :  Send + Sync + SecureMemory {

    /// The inner variable of TVar will be kept inside a
    /// `TControlBlockVar` which in turn keeps a reference to
    /// the accessing threads
    inner : Arc<TControlBlockVar<T>>
}

impl<T> TVar<T> where T :  Send + Sync + SecureMemory {

    /// Reads the actual value atomically and returns it
    pub fn read_atomic(&self) -> Result<T, Error> { ... }

    /// Writes the actual value atomically.
    pub fn write_atomic(&self, value: T) -> Result<(), Error> { ... }
}

impl<T> Clone for TVar<T>
where
    T: Send + Sync + SecureMemory,
{
    /// We implement clone manually to always
    /// return a copy of the atomic reference. The value itself
    /// is shared between all clones and is not copied.
    fn clone(&self) -> Self {
        TVar {
            inner: self.inner.clone(),
        }
    }
}


/// This keeps track of how other threads have accessed the variable
/// and tried to modify it.
pub struct TControlBlockVar<T>
where
    T: Send + Sync + SecureMemory,
{
    /// This field holds a number of waiting threads intending to 
    /// modify the inner data. The mutex itself holds a list of weak
    /// pointers to each control block, which can be atomically
    /// blocked or unblocked.
    waiting_threads: Mutex<Vec<Weak<ControlBlock>>>,

    /// This represents the target value to be modified. Since we are holding an
    /// atomic reference, the value itself cannot be modified in place, but is
    /// replaced with the target value
    value: RwLock<Arc<T>>,
}

/// The `ControlBlock` contains a handle to the accessing thread and a flag to indicate blocking
pub(crate) struct ControlBlock {
    thread: thread::Thread,
    blocked: AtomicBool,
}

/// Some logging structure to keep track of the modifications
pub enum TLog<T> where T :  Send + Sync + SecureMemory {

    /// The inner variable has been read by some thread
    Read(T),

    /// The inner var has been written to by some thread
    Write(T),

    /// The current logged state is invalid (reading, while another thread has
    /// written inside a transaction)
    Invalid(T)
}

/// A `Transaction` shall work on a per type basis
pub struct Transaction<T> where T:  Send + Sync + SecureMemory
{   
    /// the inner variable as the commit target
    inner : Arc<TVar<T>>,

    /// a transaction log for all operations on var (`inner`)
    /// TBD: The current proposal employs a simple vector to store all
    /// transactional values, but we need to make sure that
    /// threads accessing a log entry remain consistent, insofar as
    /// threads will not be blocked or end up in a deadlock. An implementation
    /// should make sure that the log itself stays in order to avoid deadlocks
    log : Vec<TLog<T>>,

    // ...  more fields
}
/// The implementation of a `Transaction` is the main interface for all
/// transaction based types. A transaction is ideally started from the
/// outside with a provided `atomic` function
impl<T> Transaction<T> where T :  Send + Sync + SecureMemory {

    /// Creates a new transaction with a function
    pub fn with_function<F>(tx : F) -> Self
    where
        F : Fn(&Transaction<T>) -> Result<(), Error> { ... }

    /// Commits the transaction to memory, i.e. writes the calculated value to memory
    pub fn commit(&self) -> Result<(), Error> { ... }

    /// read a transactional variable atomically
    pub fn read(&self, var : &TVar<T>) -> T { ... }

    /// write a transactional variable atomically
    pub fn write(&self, var : &TVar<T>, value : T) -> Result<(), Error> {
        // ...
    }

    /// retries the transaction, if the transactional variable is
    /// in an illegal state
    pub fn retry(&self) -> Result<(), Error> {
        // ...
    }
}


// .. alternatively transaction could be a trait

/// Transactional trait
pub trait Transactional<T> where T :  Send + Sync + SecureMemory {
    type Error;

    fn read(&self, var : &TVar<T>) -> T;

    fn write(&self, var : &TVar<T>, value : T) -> Result<(), Self::Error>;

    fn commit(&self) -> Result<(), Self::Error>;
}

/// this trait could then be implemented for all types 
/// with given trait bounds
impl<T> Transactional<T> for T where T : Send + Sync + SecureMemory{
    /// ...
}

Integration Into Existing Stronghold Types

The following subsections describe possible integration of transactional memory on existing types and components. Each subsection discusses possible integrations and their issues, and ideally how to overcome them.

Transactional Clients

Clients should be able to work in concurrent contexts; thus their inner state should remain consistent. Clients provide an atomic reference to their objects, so the inner state will not be copied.

Transactional Vaults

Vaults can be accessed via DbView. Each Vault can be locked in a TVar, thus isolating access to the state itself and exposing it to modifications via TLog. Since Clients themselves should not be directly transactional, each atomic modification of the Vault's state should be done here. TVar differs here from TLog, since the vault entries themselves are protected.

Transactional Snapshots

Regarding Snapshots, we need to distinguish between Snapshots retained encrypted in memory and Snapshot files. While the former can be put directly into a transaction, the latter could write non-revertible state into the file, which would possibly require keeping a log of operations for the file itself; this is not desirable. Instead of putting the write (read "I/O" here) operations into a transaction, a successful commit to the Snapshot state could trigger an atomic write into the Snapshot file, so that the file always holds a consistent state.
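One way the post-commit write to the Snapshot file could be made atomic is the usual write-then-rename pattern. The sketch below assumes the temporary file lives next to the target on the same filesystem; `write_snapshot_atomically` is a hypothetical helper, and the atomicity of the rename is a property of POSIX filesystems rather than something Rust guarantees on every platform.

```rust
use std::fs;
use std::io::Write;
use std::path::Path;

/// Writes `bytes` to `target` by first writing a sibling temporary file and
/// then renaming it over the target. Readers see either the old or the new
/// content, never a partially written file (on POSIX filesystems).
pub fn write_snapshot_atomically(target: &Path, bytes: &[u8]) -> std::io::Result<()> {
    // Hypothetical naming scheme for the temporary file.
    let tmp = target.with_extension("tmp");
    {
        let mut file = fs::File::create(&tmp)?;
        file.write_all(bytes)?;
        // Flush file content to disk before the rename makes it visible.
        file.sync_all()?;
    }
    fs::rename(&tmp, target)
}
```

This would be called only after the in-memory Snapshot state has been committed, so a retried transaction never touches the file.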

Transactional Procedures

Procedures mainly rely on secrets inside the Vault; each operation shall make use of transactions.

Security

Since working with transactional memory involves writing designated state into a target state, the designated state written into and stored by a TLog must be memory secure. The STM makes use of the underlying protection mechanism, provided by the new runtime RFC. The interface to obtain secured memory regions (non-contiguous) looks roughly like this:

/// GuardedMemory keeps a protected region on memory 
/// TBD: 
/// - operations on this type should give access to the inner data.
/// - the specific implementation of this type
/// - description of functionality
pub struct GuardedMemory { ... }

pub trait SecureMemory: Zeroize + Sized {
    type Error;
    type Key;

    /// Allocates at least std::mem::size_of_val(payload) + guarded pages,
    /// returns itself as guarded smart pointer to memory, zeroes out 
    /// the payload after successful creation.  
    /// The underlying memory is locked by default.
    /// TBD: `SecureMemory` shall also take care of transparently encrypting
    /// the underlying data with a transparently used composite key
    fn alloc<T>(payload: T) -> Result<Self, Self::Error>
    where
        T: Zeroize + AsRef<Vec<u8>>;

    /// Takes in a previously returned `GuardedMemory`, encrypts it with the provided 
    /// Key type (TBD: this could also be a `KeyProvider` storing either some kind of
    /// composite key, or give access to some hardware secure module) and returns 
    /// a possibly reallocated memory wrapped up in a self type.
    fn lock(self, memory: GuardedMemory, key: Self::Key) -> Result<Self, Self::Error>;

    /// Unlocks the underlying memory, decrypts it with provided `key`, 
    /// and returns a `GuardedMemory` type. We assume here, that the 
    /// returned `GuardedMemory` can give access to a mutable reference 
    /// in order to work with the data.
    fn unlock(&self, key: Self::Key) -> GuardedMemory;
}

Integration of `SecureMemory`

As outlined above, TLog and TVar are strongly bound to SecureMemory. Each read and write operation on the respective types makes heavy use of the secure memory API. This would incur a performance penalty, as unlocking and decrypting take a certain amount of time, but each memory operation would be secure by default.
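The access pattern this implies can be sketched as follows: every read or write goes through a scoped accessor, which is where a real implementation would unlock and decrypt guarded pages and lock them again afterwards. `ScopedSecret` is a hypothetical stand-in that uses a plain `Vec<u8>` and provides no actual memory protection or encryption; it only shows the shape of the API and the zeroing of the payload.

```rust
use std::sync::atomic::{compiler_fence, Ordering};

/// Hypothetical stand-in for a secured memory region.
pub struct ScopedSecret {
    bytes: Vec<u8>,
    /// Counts accesses, to make the per-operation cost visible.
    unlock_count: usize,
}

impl ScopedSecret {
    pub fn new(bytes: Vec<u8>) -> Self {
        Self { bytes, unlock_count: 0 }
    }

    /// Grants access to the secret only for the duration of `f`.
    /// A real implementation would unlock and decrypt here, and lock
    /// the memory again once `f` has returned.
    pub fn with_unlocked<R>(&mut self, f: impl FnOnce(&mut [u8]) -> R) -> R {
        self.unlock_count += 1;
        f(&mut self.bytes)
    }

    /// Overwrites the secret with zeroes.
    pub fn wipe(&mut self) {
        for byte in self.bytes.iter_mut() {
            // Volatile write so the compiler does not elide the zeroing.
            unsafe { std::ptr::write_volatile(byte, 0) };
        }
        compiler_fence(Ordering::SeqCst);
    }

    pub fn unlock_count(&self) -> usize {
        self.unlock_count
    }
}

impl Drop for ScopedSecret {
    /// The payload is wiped when the value goes out of scope.
    fn drop(&mut self) {
        self.wipe();
    }
}
```

A transaction log that stores such values pays one unlock per logged read or write, which is the performance penalty mentioned above.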

Drawbacks

Why should we not do this?

Since an STM uses optimistic locking, long transactions could hang or retry for a very long time. This can be mitigated with a construct similar to a circuit breaker: after a number of failed retries, the underlying transaction returns an error.
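A bounded retry loop of this kind might look like the sketch below. The names `atomic_with_limit` and `TxError` are hypothetical, and the transaction body signals a conflict by returning `None`; a real implementation would detect conflicts through read validation instead.

```rust
/// Hypothetical error returned once the retry budget is used up.
#[derive(Debug, PartialEq)]
pub enum TxError {
    RetriesExhausted { attempts: usize },
}

/// Runs `tx` up to `max_attempts` times. `tx` receives the attempt number
/// (starting at 1) and returns `None` to signal a conflict.
pub fn atomic_with_limit<T, F>(max_attempts: usize, mut tx: F) -> Result<T, TxError>
where
    F: FnMut(usize) -> Option<T>,
{
    for attempt in 1..=max_attempts {
        if let Some(value) = tx(attempt) {
            return Ok(value);
        }
        // Conflict: the attempt is discarded and the loop retries.
    }
    Err(TxError::RetriesExhausted { attempts: max_attempts })
}
```

The caller can then decide whether to surface the error, back off, or fall back to a pessimistic lock for that operation.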

Rationale and alternatives

Why is this design the best in the space of possible designs?

An STM eases the use of concurrent processes and offers the user a more natural way to call functionality safely. Employing an STM also makes the inner workings of synchronization transparent.

What other designs have been considered and what is the rationale for not choosing them?

Another approach for handling concurrent processes is an actor system like actix. The prior framework, riker, has been deprecated, while actix takes over the asynchronous runtime, which complicates the integration of Stronghold into existing security-aware applications and moves the responsibility of synchronization back to the implementer.

What is the impact of not doing this?

The responsibility to operate Stronghold safely in a concurrent environment will still be shifted to the developer. Even though Stronghold can be used in a concurrent system, it does not itself offer the benefits of a multi-tenant system.

Unresolved questions

What parts of an atomic transaction shall be isolated, and how?
- partially answered on integration
How can we solve nested transactions? Shall outer transactions be validated first?
How would an async interface work for transactional types?

Future possibilities

none so far.
