Commit 7defd09

Andy Ross authored and nashif committed

samples: Add MetaIRQ dispatch sample

MetaIRQs are described in docs and exercised in tests, but there's no
sample explaining how they are intended to be used to perform
interrupt bottom half processing.

This simple tool spawns a set of worker threads at different
priorities (some cooperative) which process "messages" (which require
CPU time to handle) received from a fake "device" implemented with
timer interrupts. The device hands off the events directly to a
MetaIRQ thread which is expected to parse and dispatch them to the
worker threads.

The test demonstrates that no matter the state of the system, the
MetaIRQ thread always runs synchronously when woken by the ISR and is
able to preempt all worker threads to do its job.

Signed-off-by: Andy Ross <[email protected]>

1 parent cea6e3f commit 7defd09
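The "some cooperative" wording refers to how main() (in src/main.c below) assigns priorities: `prio = -NUM_THREADS/2 + i`, so the lower-numbered half of the workers get negative, i.e. cooperative, priorities. A minimal standalone sketch of that mapping; `NUM_THREADS` actually lives in the sample's main.h (not shown on this page), so the value 4 here is only an assumed example:

```c
#include <assert.h>

/* Assumed value for illustration; the real definition is in the
 * sample's main.h, which this commit page does not show.
 */
#define NUM_THREADS 4

/* Mirrors the priority computation in the sample's main(): workers
 * 0..NUM_THREADS/2-1 land at negative (cooperative) priorities, the
 * rest at non-negative (preemptible) ones.
 */
static int worker_prio(long i)
{
	return (int)(-NUM_THREADS / 2 + i);
}
```

With 4 workers this yields priorities -2, -1, 0, 1: threads 0 and 1 are cooperative and can only be preempted by the MetaIRQ dispatcher.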

File tree

9 files changed: +523 -0 lines changed

samples/index.rst

Lines changed: 1 addition & 0 deletions

@@ -24,6 +24,7 @@ Samples and Demos
    portability/*
    posix/*
    video/*
+   scheduler/*
    smp/*

 .. comment

samples/scheduler/index.rst

Lines changed: 10 additions & 0 deletions

@@ -0,0 +1,10 @@
.. _scheduler-samples:

Various Kernel and Scheduler Samples
####################################

.. toctree::
   :maxdepth: 1
   :glob:

   **/*
CMakeLists.txt

Lines changed: 8 additions & 0 deletions

@@ -0,0 +1,8 @@
# SPDX-License-Identifier: Apache-2.0

cmake_minimum_required(VERSION 3.13.1)

include($ENV{ZEPHYR_BASE}/cmake/app/boilerplate.cmake NO_POLICY_SCOPE)
project(metairq_dispatch)

target_sources(app PRIVATE src/main.c src/msgdev.c)
README.rst

Lines changed: 63 additions & 0 deletions

@@ -0,0 +1,63 @@
MetaIRQ Thread Priority Demonstration
#####################################

Overview
********

This sample demonstrates the use of a thread running at a MetaIRQ
priority level to implement "bottom half" style processing
synchronously with the end of a hardware ISR. It implements a
simulated "device" that produces messages that need to be dispatched
to asynchronous queues feeding several worker threads, each running at
a different priority. The dispatch is handled by a MetaIRQ thread fed
via a queue from the device ISR (really just a timer interrupt).

Each message has a random (and non-trivial) amount of processing that
must happen in the worker thread. This implements a "bursty load"
environment where occasional spikes in load require preemption of
running threads and delay scheduling of lower priority threads.
Messages are accompanied by a timestamp that allows per-message
latencies to be computed at several points:

* The cycle time between message creation in the ISR and receipt by
  the MetaIRQ thread for dispatch.

* The time between ISR and receipt by the worker thread.

* The real time spent processing the message in the worker thread, for
  comparison with the required processing time. This provides a way
  to measure preemption overhead where the thread is not scheduled.

Aspects to note in the results:

* On average, higher priority (lower numbered) threads have better
  latencies and lower processing delays, as expected.

* Cooperatively scheduled threads have significantly better processing
  delay behavior than preemptible ones, as they can only be preempted
  by the MetaIRQ thread.

* Because of queueing and the bursty load, all worker threads of any
  priority will experience some load-dependent delays, as the CPU
  occasionally has more work to do than time available.

* But, no matter the system load or thread configuration, the MetaIRQ
  thread always runs immediately after the ISR. It shows reliable,
  constant latency under all circumstances because it can preempt all
  other threads, including cooperative ones that cannot normally be
  preempted.

Requirements
************

This sample should run well on any Zephyr platform that provides
preemption of running threads by interrupts, a working timer driver,
and working log output. For precision reasons, it produces better
(and more) data on systems with a high timer tick rate (ideally 10+
kHz).

Note that because the test is fundamentally measuring thread
preemption behavior, it does not run without modification on
native_posix platforms. In that emulation environment, threads will
not be preempted except at specific instrumentation points (e.g. in
k_busy_wait()) where they will voluntarily release the CPU.
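The per-message latencies described above are computed as plain differences of k_cycle_get_32() timestamps (e.g. `start - m.timestamp` in the worker thread). A minimal standalone sketch of why that unsigned 32-bit subtraction stays correct even when the cycle counter wraps between the two reads; `cycle_delta` is an illustrative name, not part of the sample:

```c
#include <assert.h>
#include <stdint.h>

/* Elapsed cycles between two reads of a free-running 32-bit cycle
 * counter.  The subtraction is performed in unsigned 32-bit
 * arithmetic (i.e. modulo 2^32), so the result is correct even if the
 * counter wrapped between the reads, as long as the true elapsed time
 * is less than one full counter period.
 */
static inline uint32_t cycle_delta(uint32_t start, uint32_t now)
{
	return now - start;
}
```

For example, a read of 0x00000020 taken after a read of 0xFFFFFF00 yields a delta of 0x120 cycles, not a huge negative-looking number.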
prj.conf

Lines changed: 9 additions & 0 deletions

@@ -0,0 +1,9 @@
CONFIG_ASSERT=y
CONFIG_NUM_METAIRQ_PRIORITIES=1
CONFIG_TEST_RANDOM_GENERATOR=y
CONFIG_LOG=y
CONFIG_LOG_MINIMAL=y

# We're testing delivery latency to threads on a single CPU; MP
# dispatch will mess up the statistics.
CONFIG_MP_NUM_CPUS=1
sample.yaml

Lines changed: 19 additions & 0 deletions

@@ -0,0 +1,19 @@
sample:
  description: Sample demonstrating dispatch of hardware events from a
    MetaIRQ thread
  name: MetaIRQ Dispatch
common:
  harness: console
  harness_config:
    type: one_line
    regex:
      - "MetaIRQ Test Complete"

# Note that native_posix architectures are filtered; they require
# instrumentation (e.g. a k_busy_wait()) inside the worker threads'
# "busy" loops in order for the interrupts to fire on time, and the
# sample is designed to demonstrate completely arbitrary CPU work.
tests:
  sample.metairq_dispatch:
    tags: introduction
    filter: not CONFIG_ARCH_POSIX
src/main.c

Lines changed: 239 additions & 0 deletions

@@ -0,0 +1,239 @@
/*
 * Copyright (c) 2020 Intel Corporation
 *
 * SPDX-License-Identifier: Apache-2.0
 */
#include <zephyr.h>
#include "main.h"

#include <logging/log.h>
LOG_MODULE_REGISTER(main, LOG_LEVEL_INF);

#define STACK_SIZE 2048

/* How many messages can be queued for a single thread */
#define QUEUE_DEPTH 16

/* Array of worker threads, and their stacks */
static struct thread_rec {
	struct k_thread thread;
	struct k_msgq msgq;
	struct msg msgq_buf[QUEUE_DEPTH];
} threads[NUM_THREADS];

K_THREAD_STACK_ARRAY_DEFINE(thread_stacks, NUM_THREADS, STACK_SIZE);

/* The static metairq thread we'll use for dispatch */
static void metairq_fn(void *p1, void *p2, void *p3);
K_THREAD_DEFINE(metairq_thread, STACK_SIZE, metairq_fn,
		NULL, NULL, NULL, K_HIGHEST_THREAD_PRIO, 0, K_NO_WAIT);

/* Accumulated list of latencies, for a naive variance computation at
 * the end.
 */
struct {
	atomic_t num_mirq;
	u32_t mirq_latencies[MAX_EVENTS];
	struct {
		u32_t nevt;
		u32_t latencies[MAX_EVENTS * 2 / NUM_THREADS];
	} threads[NUM_THREADS];
} stats;

/* A semaphore with an initial count, used to allow only one thread to
 * log the final report.
 */
K_SEM_DEFINE(report_cookie, 1, 1);

static void metairq_fn(void *p1, void *p2, void *p3)
{
	ARG_UNUSED(p1);
	ARG_UNUSED(p2);
	ARG_UNUSED(p3);

	while (true) {
		/* Receive a message, immediately check a timestamp
		 * and compute a latency value, then dispatch it to
		 * the queue for its target thread
		 */
		struct msg m;

		message_dev_fetch(&m);
		m.metairq_latency = k_cycle_get_32() - m.timestamp;

		int ret = k_msgq_put(&threads[m.target].msgq, &m, K_NO_WAIT);

		if (ret) {
			LOG_INF("Thread %d queue full, message %d dropped",
				m.target, m.seq);
		}
	}
}

/* Simple recursive implementation of an integer square root, cribbed
 * from wikipedia
 */
static u32_t isqrt(u64_t n)
{
	if (n > 1) {
		u64_t lo = isqrt(n >> 2) << 1;
		u64_t hi = lo + 1;

		return (u32_t)(((hi * hi) > n) ? lo : hi);
	}
	return (u32_t) n;
}

static void calc_stats(const u32_t *array, u32_t n,
		       u32_t *lo, u32_t *hi, u32_t *mean, u32_t *stdev)
{
	u64_t tot = 0, totsq = 0;

	*lo = INT_MAX;
	*hi = 0;
	for (int i = 0; i < n; i++) {
		*lo = MIN(*lo, array[i]);
		*hi = MAX(*hi, array[i]);
		tot += array[i];
	}

	*mean = (u32_t)((tot + (n / 2)) / n);

	for (int i = 0; i < n; i++) {
		s64_t d = (s32_t) (array[i] - *mean);

		totsq += d * d;
	}

	*stdev = isqrt((totsq + (n / 2)) / n);
}

static void record_latencies(struct msg *m, u32_t latency)
{
	/* Workaround: qemu emulation shows an erroneously high
	 * metairq latency for the very first event of 7-8us. Maybe
	 * it needs to fault in our code pages in the host?
	 */
	if (IS_ENABLED(CONFIG_QEMU_TARGET) && m->seq == 0) {
		return;
	}

	int t = m->target;
	int lidx = stats.threads[t].nevt++;

	if (lidx < ARRAY_SIZE(stats.threads[t].latencies)) {
		stats.threads[t].latencies[lidx] = latency;
	}

	stats.mirq_latencies[atomic_inc(&stats.num_mirq)] = m->metairq_latency;

	/* Once we've logged our final event, print a report.  We use
	 * a semaphore with an initial count of 1 to ensure that only
	 * one thread gets to do this.  Also events can be processed
	 * out of order, so add a small sleep to let the queues
	 * finish.
	 */
	if (m->seq == MAX_EVENTS - 1) {
		u32_t hi, lo, mean, stdev, ret;

		ret = k_sem_take(&report_cookie, K_FOREVER);
		__ASSERT_NO_MSG(ret == 0);
		k_sleep(100);

		calc_stats(stats.mirq_latencies, stats.num_mirq,
			   &lo, &hi, &mean, &stdev);

		LOG_INF("        ---------- Latency (cyc) ----------");
		LOG_INF("            Best    Worst     Mean    Stdev");
		LOG_INF("MetaIRQ %8d %8d %8d %8d", lo, hi, mean, stdev);

		for (int i = 0; i < NUM_THREADS; i++) {
			if (stats.threads[i].nevt == 0) {
				LOG_WRN("No events for thread %d", i);
				continue;
			}

			calc_stats(stats.threads[i].latencies,
				   stats.threads[i].nevt,
				   &lo, &hi, &mean, &stdev);

			LOG_INF("Thread%d %8d %8d %8d %8d",
				i, lo, hi, mean, stdev);
		}

		LOG_INF("MetaIRQ Test Complete");
	}
}

static void thread_fn(void *p1, void *p2, void *p3)
{
	ARG_UNUSED(p2);
	ARG_UNUSED(p3);
	int id = (long)p1;
	struct msg m;

	LOG_INF("Starting Thread%d at priority %d", id,
		k_thread_priority_get(k_current_get()));

	while (true) {
		int ret = k_msgq_get(&threads[id].msgq, &m, K_FOREVER);
		u32_t start = k_cycle_get_32();

		__ASSERT_NO_MSG(ret == 0);

		/* Spin on the CPU for the requested number of cycles
		 * doing the "work" required to "process" the event.
		 * Note the inner loop: hammering on k_cycle_get_32()
		 * on some platforms requires locking around the timer
		 * driver internals and can affect interrupt latency.
		 * Obviously we may be preempted as new events arrive
		 * and get queued.
		 */
		while (k_cycle_get_32() - start < m.proc_cyc) {
			for (volatile int i = 0; i < 100; i++) {
			}
		}

		u32_t dur = k_cycle_get_32() - start;

#ifdef LOG_EVERY_EVENT
		/* Log the message, its thread, and the following cycle values:
		 * 1. Receive it from the driver in the MetaIRQ thread
		 * 2. Begin processing it out of the queue in the worker thread
		 * 3. The requested processing time in the message
		 * 4. The actual time taken to process the message
		 *    (may be higher if the thread was preempted)
		 */
		LOG_INF("M%d T%d mirq %d disp %d proc %d real %d",
			m.seq, id, m.metairq_latency,
			start - m.timestamp, m.proc_cyc, dur);
#endif

		/* Collect the latency values in a big statistics array */
		record_latencies(&m, start - m.timestamp);
	}
}

void main(void)
{
	for (long i = 0; i < NUM_THREADS; i++) {
		/* Each thread gets a different priority.  Half should
		 * be at (negative) cooperative priorities.  Lower
		 * thread numbers have higher priority values,
		 * e.g. thread 0 will be preempted only by the
		 * metairq.
		 */
		int prio = (-NUM_THREADS/2) + i;

		k_msgq_init(&threads[i].msgq, (char *)threads[i].msgq_buf,
			    sizeof(struct msg), QUEUE_DEPTH);

		k_thread_create(&threads[i].thread,
				thread_stacks[i], STACK_SIZE,
				thread_fn, (void *)i, NULL, NULL,
				prio, 0, K_NO_WAIT);
	}

	message_dev_init();
}
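The final report leans on two small numeric helpers, isqrt() and calc_stats(), which are easy to check off-target. The sketch below re-implements them with standard `<stdint.h>` types in place of Zephyr's `u32_t`/`u64_t`, and supplies plain MIN/MAX macros standing in for Zephyr's; the logic is otherwise the same rounded-mean and population-standard-deviation computation as the sample's:

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins for Zephyr's MIN/MAX macros */
#define MIN(a, b) (((a) < (b)) ? (a) : (b))
#define MAX(a, b) (((a) > (b)) ? (a) : (b))

/* Same recursive integer square root as the sample: floor(sqrt(n)),
 * refined one bit at a time from the root of n/4.
 */
static uint32_t isqrt(uint64_t n)
{
	if (n > 1) {
		uint64_t lo = isqrt(n >> 2) << 1;
		uint64_t hi = lo + 1;

		return (uint32_t)(((hi * hi) > n) ? lo : hi);
	}
	return (uint32_t)n;
}

/* Min, max, mean rounded to nearest (note the +n/2 before dividing),
 * and integer population standard deviation over an array.
 */
static void calc_stats(const uint32_t *array, uint32_t n,
		       uint32_t *lo, uint32_t *hi,
		       uint32_t *mean, uint32_t *stdev)
{
	uint64_t tot = 0, totsq = 0;

	*lo = UINT32_MAX;
	*hi = 0;
	for (uint32_t i = 0; i < n; i++) {
		*lo = MIN(*lo, array[i]);
		*hi = MAX(*hi, array[i]);
		tot += array[i];
	}

	*mean = (uint32_t)((tot + (n / 2)) / n);

	for (uint32_t i = 0; i < n; i++) {
		int64_t d = (int32_t)(array[i] - *mean);

		totsq += (uint64_t)(d * d);
	}

	*stdev = isqrt((totsq + (n / 2)) / n);
}
```

For the classic data set {2, 4, 4, 4, 5, 5, 7, 9}, this yields lo=2, hi=9, mean=5, stdev=2, matching the exact population standard deviation.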
