
Stream Kernel #311

@Max-Meldrum

Description


Moving forward, we will need a stream kernel for the data processing layer that is specifically designed for Arcon. This issue serves as a direction (i.e., it is not prioritised in the short term).

Related issues:

#277
#246
#214

Kernel

The Kernel represents an application-level OS that manages Task scheduling, memory management, and I/O. The idea is to cooperatively schedule a set of tasks on a single core in order to improve CPU utilisation, increase locality between tasks, and avoid context switches. As noted here, storage and networking are no longer the bottlenecks in a modern data center; CPUs are.

Running a Thread-Per-Core (TPC) model with cooperative scheduling is not in itself unique. Below are some data-parallel systems that use it successfully.

  1. ScyllaDB
  2. Hazelcast Jet
  3. Redpanda

Rough Overview Sketch

(Image: kernel_arch, a rough sketch of the kernel architecture)

The Kernel has the following context that is shared between tasks that it executes:

  1. Time (Watermark)
  2. Epoch
  3. Logger
  4. State Backend(s)
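A minimal sketch of that shared context is below. All names (`Watermark`, `Epoch`, `KernelContext`, etc.) are illustrative placeholders, since the issue does not fix an interface; the logger is omitted for brevity. Because the context is shared only between tasks on the same core, `Rc<RefCell<..>>` suffices and no `Send`/`Sync` bounds are needed.

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Hypothetical sketch of the per-kernel shared context described above.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Watermark(u64);

#[derive(Debug, Clone, Copy, PartialEq)]
struct Epoch(u64);

// Placeholder for a state backend; a real one would wrap storage.
#[derive(Debug, Default)]
struct StateBackend;

// Shared within a single kernel (single core), so Rc<RefCell<..>>
// suffices under a thread-per-core model: no atomics or locks.
#[derive(Debug)]
struct KernelContext {
    watermark: Watermark,
    epoch: Epoch,
    backends: Vec<StateBackend>,
}

type SharedContext = Rc<RefCell<KernelContext>>;

fn new_shared_context() -> SharedContext {
    Rc::new(RefCell::new(KernelContext {
        watermark: Watermark(0),
        epoch: Epoch(0),
        backends: vec![StateBackend::default()],
    }))
}
```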

Task

A cooperative async long-running future that drives the execution of a dataflow node. A Task must always check if it should yield back control to other tasks in order to not block progress.

// source example
let source = ...;
let task: Task = async move {
    loop {
        while let Some(elem) = source.poll_next().await {
            // output elem to the next task
            // periodically yield control back to the scheduler
        }
    }
};
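One way to make the "check if it should yield" rule concrete is a per-task budget: after processing N elements, the task awaits a yield point so sibling tasks on the same core can run. This is a hypothetical sketch; the `YieldNow` future and the tiny `block_on` executor below only stand in for whichever runtime is eventually chosen.

```rust
use std::future::Future;
use std::pin::Pin;
use std::task::{Context, Poll, RawWaker, RawWakerVTable, Waker};

// A future that yields exactly once, returning control to the scheduler.
struct YieldNow(bool);

impl Future for YieldNow {
    type Output = ();
    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.0 {
            Poll::Ready(())
        } else {
            self.0 = true;
            cx.waker().wake_by_ref();
            Poll::Pending
        }
    }
}

fn yield_now() -> YieldNow {
    YieldNow(false)
}

// Hypothetical task body: process elements, yielding every BUDGET items.
async fn run_task(input: Vec<u64>, out: &mut Vec<u64>) {
    const BUDGET: usize = 4;
    for (i, elem) in input.into_iter().enumerate() {
        out.push(elem * 2); // stand-in for "output elem to next task"
        if (i + 1) % BUDGET == 0 {
            yield_now().await; // cooperative yield point
        }
    }
}

// Minimal single-future block_on, only to make the sketch runnable.
fn block_on<F: Future>(mut fut: F) -> F::Output {
    fn raw() -> RawWaker {
        fn clone(_: *const ()) -> RawWaker { raw() }
        fn noop(_: *const ()) {}
        static VTABLE: RawWakerVTable = RawWakerVTable::new(clone, noop, noop, noop);
        RawWaker::new(std::ptr::null(), &VTABLE)
    }
    let waker = unsafe { Waker::from_raw(raw()) };
    let mut cx = Context::from_waker(&waker);
    let mut fut = unsafe { Pin::new_unchecked(&mut fut) };
    loop {
        if let Poll::Ready(v) = fut.as_mut().poll(&mut cx) {
            return v;
        }
    }
}
```

In a real scheduler, the budget would likely depend on the task's latency class rather than a fixed constant.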

Tasks may send their output in 3 different ways:

  1. Intra-Kernel with Rc<RefCell<Vec<T>>>
  2. Local Inter-Kernel with explicit queues (e.g., spsc)
  3. Remote Inter-Kernel over the wire
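Option 1 can be sketched as follows, with illustrative `Producer`/`Consumer` names not taken from the issue. Since both tasks live on the same kernel (same core), the shared buffer needs no atomics or locks.

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Sketch of option 1: two tasks on the same kernel share an output
// buffer directly. Single-threaded, so Rc<RefCell<..>> is enough.
type IntraKernelChannel<T> = Rc<RefCell<Vec<T>>>;

struct Producer<T> {
    out: IntraKernelChannel<T>,
}

impl<T> Producer<T> {
    // Append one element to the downstream task's buffer.
    fn emit(&self, elem: T) {
        self.out.borrow_mut().push(elem);
    }
}

struct Consumer<T> {
    input: IntraKernelChannel<T>,
}

impl<T> Consumer<T> {
    // Drain whatever the upstream task has buffered since the last poll.
    fn drain(&self) -> Vec<T> {
        self.input.borrow_mut().drain(..).collect()
    }
}
```

The local inter-kernel case (option 2) would swap the `Rc<RefCell<..>>` for an SPSC queue, and the remote case (option 3) would add serialization and networking.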

Application-level Task Scheduling

API levels

Suggested by @segeljakt

  1. High-level API: builtin operators (map, filter, window, keyby)
  2. Mid-level API: operator constructors + event handlers
  3. Low-level API: tasks/async functions + channels
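To illustrate the mid-level idea, here is a hypothetical operator constructor that wraps a user-supplied event handler; none of these names exist in Arcon today. The high-level API would compose such operators behind builtins like `map`, while the low-level API would bypass them entirely with raw tasks and channels.

```rust
use std::marker::PhantomData;

// Hypothetical mid-level operator: a constructor taking an event
// handler closure that is invoked once per element.
struct MapOperator<I, O, F: FnMut(I) -> O> {
    f: F,
    _marker: PhantomData<(I, O)>,
}

impl<I, O, F: FnMut(I) -> O> MapOperator<I, O, F> {
    fn new(f: F) -> Self {
        Self { f, _marker: PhantomData }
    }

    // Event handler entry point: transform one element.
    fn handle_elem(&mut self, elem: I) -> O {
        (self.f)(elem)
    }
}
```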

Async-friendly Runtime

Currently, it is hard to support async interfaces/crates. Two prime examples are source implementations and async state backends.

// Rough sketching of source, operator, and state interfaces.

#[async_trait]
pub trait Source {
    type Item: ArconType;
    async fn poll_next(&mut self) -> Result<Poll<Self::Item>>;
}

#[async_trait]
pub trait Operator {
    type IN: ArconType;
    async fn handle_elem(&mut self, elem: ArconElement<Self::IN>);
}

#[async_trait]
pub trait ValueIndex<V: Value> {
    async fn get(&mut self) -> Result<Option<V>>;
    async fn put(&mut self, value: V) -> Result<()>;
}
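As a concrete sketch of the `Source` shape above, here is a hypothetical in-memory source. The `async_trait` macro and the `ArconType`/`Result` bounds are dropped so the snippet compiles on stable Rust with no dependencies, and a minimal `block_on` stands in for whichever runtime is eventually chosen.

```rust
use std::future::Future;
use std::pin::Pin;
use std::task::{Context, Poll, RawWaker, RawWakerVTable, Waker};

// Hypothetical in-memory source matching the shape of the Source
// trait sketched above (simplified: no async_trait, no Result).
struct VecSource<T> {
    items: std::vec::IntoIter<T>,
}

impl<T> VecSource<T> {
    fn new(items: Vec<T>) -> Self {
        Self { items: items.into_iter() }
    }

    // Returns the next element, or None once the source is exhausted.
    async fn poll_next(&mut self) -> Option<T> {
        self.items.next()
    }
}

// Minimal executor, standing in for the kernel's scheduler.
fn block_on<F: Future>(mut fut: F) -> F::Output {
    fn raw() -> RawWaker {
        fn clone(_: *const ()) -> RawWaker { raw() }
        fn noop(_: *const ()) {}
        static VTABLE: RawWakerVTable = RawWakerVTable::new(clone, noop, noop, noop);
        RawWaker::new(std::ptr::null(), &VTABLE)
    }
    let waker = unsafe { Waker::from_raw(raw()) };
    let mut cx = Context::from_waker(&waker);
    let mut fut = unsafe { Pin::new_unchecked(&mut fut) };
    loop {
        if let Poll::Ready(v) = fut.as_mut().poll(&mut cx) {
            return v;
        }
    }
}

// Drain the source in a task-style loop, as a Task body would.
fn drain_source(mut src: VecSource<u64>) -> Vec<u64> {
    block_on(async move {
        let mut out = Vec::new();
        while let Some(elem) = src.poll_next().await {
            out.push(elem);
        }
        out
    })
}
```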

Glommio

Glommio is a Seastar-inspired TPC cooperative threading framework built in Rust. It relies on Linux and its io_uring functionality. This is the only notable downside of adopting Glommio for Arcon: it makes Arcon a Linux-only system. But then again, data-intensive systems such as Arcon are expected to run on Linux anyway.

Another downside is that Glommio assumes the machine has NVMe storage. Specific OS and hardware requirements will make Arcon harder to run and to contribute to.

Pros:

  1. Configurable scheduling priority (latency matters vs. not)
  2. First-class io_uring and Direct I/O support (future backend)
  3. Designed to be used in data and I/O intensive systems (Arcon)
  4. Offers placement strategies (NUMA etc..)

Cons:

  1. Linux + Hardware Requirements
  2. Not as many users/contributors as, for example, tokio

An article about Glommio may be found here.

Other Async runtime candidates

  1. monoio
  2. tokio
