Skip to content

Event Tracing System for Kernel Development #25

@HeatCrab

Description

@HeatCrab

Background

Currently, Linmo lacks a mechanism to record and analyze the sequence of system events during execution. When debugging complex issues such as race conditions, priority inversion, or unexpected task scheduling behavior, developers have no way to understand what happened leading up to the problem.

This makes kernel development difficult because:

  • No execution history: Cannot see what events occurred before a problem manifested
  • Timing issues are invisible: Race conditions and scheduling anomalies are hard to diagnose
  • Context is lost: By the time a problem is noticed, the relevant events are gone
  • printf() is disruptive: Adding debug output changes timing and can hide bugs

For kernel developers, the ability to trace system events provides crucial insights into dynamic behavior and helps diagnose issues that cannot be understood from static snapshots alone.

Proposed Solution

Implement a lightweight event tracing system that records kernel events to a ring buffer, allowing developers to examine the sequence of operations leading up to a problem.

Goals

  1. Basic event recording: Capture key system events (task switches, IPC operations, interrupts)
  2. Minimal overhead: Low performance impact when enabled, zero cost when disabled
  3. Simple but extensible: Start with essential features, design for future expansion
  4. Easy integration: Simple API for recording and retrieving events
  5. Compile-time control: Can be completely disabled to save memory

Detailed Requirements

1. Event Types

Define a comprehensive set of event types covering the three main kernel subsystems:

Scheduler Events:

  • EVENT_TASK_CREATE - Task spawned
  • EVENT_TASK_DESTROY - Task cancelled
  • EVENT_TASK_SWITCH - Context switch occurred
  • EVENT_TASK_SUSPEND - Task suspended
  • EVENT_TASK_RESUME - Task resumed
  • EVENT_TASK_DELAY - Task entered delay state
  • EVENT_TASK_YIELD - Task voluntarily yielded CPU

IPC Events:

  • EVENT_SEM_WAIT - Semaphore wait operation
  • EVENT_SEM_POST - Semaphore post operation
  • EVENT_MUTEX_LOCK - Mutex lock operation
  • EVENT_MUTEX_UNLOCK - Mutex unlock operation
  • EVENT_PIPE_READ - Pipe read operation
  • EVENT_PIPE_WRITE - Pipe write operation
  • EVENT_MQUEUE_SEND - Message queue enqueue operation
  • EVENT_MQUEUE_RECV - Message queue dequeue operation

ISR Events:

  • EVENT_ISR_ENTER - Interrupt handler entry
  • EVENT_ISR_EXIT - Interrupt handler exit
  • EVENT_EXCEPTION - Exception occurred

Extensibility Note:

  • Event types defined as enum for easy addition
  • Generic param1/param2 fields allow recording event-specific information
  • Design allows adding new types (timer callbacks, custom events) without breaking existing code

2. Event Data Structure

Each event should record essential information:

typedef struct {
    uint32_t timestamp;    /* System ticks when event occurred */
    uint8_t  event_type;   /* Event type from enum */
    uint8_t  task_id;      /* Current task ID */
    uint16_t reserved;     /* Reserved for future use */
    uint32_t param1;       /* Event-specific parameter 1 */
    uint32_t param2;       /* Event-specific parameter 2 */
} debug_event_t;

Design rationale:

  • Fixed size (16 bytes) for predictable memory usage
  • Generic param1/param2 allows different events to record different information
  • Reserved field for future expansion without changing structure size
  • Timestamp enables temporal analysis

Parameter usage examples:

  • EVENT_TASK_SWITCH: param1 = from_task_id, param2 = to_task_id
  • EVENT_TASK_SUSPEND: param1 = task_id, param2 = 0
  • EVENT_TASK_RESUME: param1 = task_id, param2 = 0
  • EVENT_TASK_DELAY: param1 = delay_ticks, param2 = 0
  • EVENT_TASK_YIELD: param1 = task_id, param2 = 0
  • EVENT_SEM_WAIT: param1 = semaphore address, param2 = 0
  • EVENT_PIPE_READ: param1 = pipe address, param2 = bytes_read
  • EVENT_PIPE_WRITE: param1 = pipe address, param2 = bytes_written
  • EVENT_MQUEUE_SEND: param1 = queue address, param2 = message_type
  • EVENT_MQUEUE_RECV: param1 = queue address, param2 = message_type
  • EVENT_ISR_ENTER: param1 = interrupt code, param2 = 0

3. Ring Buffer Storage

Use a fixed-size ring buffer for event storage:

Configuration:

#define CONFIG_DEBUG_TRACE 1              /* Enable tracing */
#define DEBUG_EVENT_BUFFER_SIZE 256       /* Number of events */

Characteristics:

  • Fixed size determined at compile time
  • Oldest events are overwritten when buffer is full
  • No dynamic memory allocation
  • Simple index-based access

Extensibility considerations:

  • Buffer size can be adjusted via #define
  • Structure supports future per-event-type buffers if needed
  • Can later add buffer full notifications

4. Core API

Provide minimal but complete API:

/* Record an event */
void debug_trace_event(uint8_t event_type, uint32_t param1, uint32_t param2);

/* Display all recorded events */
void debug_dump_events(void);

/* Clear event buffer */
void debug_clear_events(void);

Future expansion points:

  • Can add debug_trace_enable(mask) for runtime control
  • Can add debug_dump_events_filtered(type) for selective display
  • Can add debug_get_event_count(type) for statistics

5. Integration Points

Events should be recorded at key locations in the kernel:

Scheduler:

  • mo_task_spawn() - Record TASK_CREATE
  • mo_task_cancel() - Record TASK_DESTROY
  • dispatcher() / _dispatch() - Record TASK_SWITCH
  • mo_task_suspend() - Record TASK_SUSPEND
  • mo_task_resume() - Record TASK_RESUME
  • mo_task_delay() - Record TASK_DELAY
  • mo_task_yield() / yield() - Record TASK_YIELD

IPC - Semaphore/Mutex:

  • mo_sem_wait() - Record SEM_WAIT
  • mo_sem_post() - Record SEM_POST
  • mo_mutex_lock() - Record MUTEX_LOCK
  • mo_mutex_unlock() - Record MUTEX_UNLOCK

IPC - Pipe:

  • mo_pipe_read() / mo_pipe_nbread() - Record PIPE_READ
  • mo_pipe_write() / mo_pipe_nbwrite() - Record PIPE_WRITE

IPC - Message Queue:

  • mo_mq_enqueue() - Record MQUEUE_SEND
  • mo_mq_dequeue() - Record MQUEUE_RECV

ISR:

  • do_trap() - Record ISR_ENTER/EXIT for interrupts
  • do_trap() - Record EXCEPTION for exceptions

Implementation note:

  • Use #ifdef CONFIG_DEBUG_TRACE to make tracing optional
  • Keep trace calls minimal to reduce impact on traced code
  • Consider adding trace points gradually (start with scheduler, then add IPC)

Implementation Suggestions

Event Type Definitions

/* Event type enumeration - easily extensible */
typedef enum {
    /* Scheduler events */
    EVENT_TASK_CREATE = 0,
    EVENT_TASK_DESTROY,
    EVENT_TASK_SWITCH,
    EVENT_TASK_SUSPEND,
    EVENT_TASK_RESUME,
    EVENT_TASK_DELAY,
    EVENT_TASK_YIELD,
    
    /* IPC events */
    EVENT_SEM_WAIT,
    EVENT_SEM_POST,
    EVENT_MUTEX_LOCK,
    EVENT_MUTEX_UNLOCK,
    EVENT_PIPE_READ,
    EVENT_PIPE_WRITE,
    EVENT_MQUEUE_SEND,
    EVENT_MQUEUE_RECV,
    
    /* ISR events */
    EVENT_ISR_ENTER,
    EVENT_ISR_EXIT,
    EVENT_EXCEPTION,
    
    /* Reserve space for future event types */
    /* EVENT_TIMER_CALLBACK, */
    /* EVENT_TIMER_START, */
    /* EVENT_TIMER_CANCEL, */
    
    EVENT_TYPE_MAX
} debug_event_type_t;

Ring Buffer Implementation

/* Ring buffer for event storage */
#if CONFIG_DEBUG_TRACE

static debug_event_t event_buffer[DEBUG_EVENT_BUFFER_SIZE];
static uint32_t event_write_index = 0;
static uint32_t event_count = 0;

/* Record an event to the ring buffer.
 * This is called from various points in the kernel to log events.
 * Designed to be fast and non-blocking.
 */
void debug_trace_event(uint8_t event_type, uint32_t param1, uint32_t param2)
{
    /* Get current timestamp */
    uint32_t timestamp = mo_ticks();
    
    /* Get current task ID safely */
    uint8_t task_id = 0;
    if (kcb && kcb->task_current && kcb->task_current->data) {
        task_id = ((tcb_t *)kcb->task_current->data)->id;
    }
    
    /* Write event to buffer */
    debug_event_t *event = &event_buffer[event_write_index];
    event->timestamp = timestamp;
    event->event_type = event_type;
    event->task_id = task_id;
    event->reserved = 0;
    event->param1 = param1;
    event->param2 = param2;
    
    /* Advance write index (circular) */
    event_write_index = (event_write_index + 1) % DEBUG_EVENT_BUFFER_SIZE;
    
    /* Track total events (saturates at buffer size) */
    if (event_count < DEBUG_EVENT_BUFFER_SIZE) {
        event_count++;
    }
}

#else
/* When tracing is disabled, make it a no-op */
#define debug_trace_event(type, p1, p2) do {} while(0)
#endif

Event Display Function

#if CONFIG_DEBUG_TRACE

/* Helper: Convert event type to string */
static const char *event_type_to_string(uint8_t type)
{
    switch (type) {
        /* Scheduler events */
        case EVENT_TASK_CREATE:   return "TASK_CREATE";
        case EVENT_TASK_DESTROY:  return "TASK_DESTROY";
        case EVENT_TASK_SWITCH:   return "TASK_SWITCH";
        case EVENT_TASK_SUSPEND:  return "TASK_SUSPEND";
        case EVENT_TASK_RESUME:   return "TASK_RESUME";
        case EVENT_TASK_DELAY:    return "TASK_DELAY";
        case EVENT_TASK_YIELD:    return "TASK_YIELD";
        
        /* IPC events */
        case EVENT_SEM_WAIT:      return "SEM_WAIT";
        case EVENT_SEM_POST:      return "SEM_POST";
        case EVENT_MUTEX_LOCK:    return "MUTEX_LOCK";
        case EVENT_MUTEX_UNLOCK:  return "MUTEX_UNLOCK";
        case EVENT_PIPE_READ:     return "PIPE_READ";
        case EVENT_PIPE_WRITE:    return "PIPE_WRITE";
        case EVENT_MQUEUE_SEND:   return "MQUEUE_SEND";
        case EVENT_MQUEUE_RECV:   return "MQUEUE_RECV";
        
        /* ISR events */
        case EVENT_ISR_ENTER:     return "ISR_ENTER";
        case EVENT_ISR_EXIT:      return "ISR_EXIT";
        case EVENT_EXCEPTION:     return "EXCEPTION";
        
        default:                  return "UNKNOWN";
    }
}

/* Display all recorded events in chronological order.
 * This walks the ring buffer and prints events in a readable format.
 */
void debug_dump_events(void)
{
    if (event_count == 0) {
        printf("No events recorded\n");
        return;
    }
    
    printf("\n=== Event Trace ===\n");
    printf("Tick   Event            Task  Param1      Param2\n");
    printf("------ ---------------- ----- ----------- -----------\n");
    
    /* Calculate starting index (oldest event) */
    uint32_t start_index;
    if (event_count < DEBUG_EVENT_BUFFER_SIZE) {
        /* Buffer not full yet, start from beginning */
        start_index = 0;
    } else {
        /* Buffer full, start from oldest (one after write position) */
        start_index = event_write_index;
    }
    
    /* Print events in chronological order */
    for (uint32_t i = 0; i < event_count; i++) {
        uint32_t idx = (start_index + i) % DEBUG_EVENT_BUFFER_SIZE;
        debug_event_t *event = &event_buffer[idx];
        
        printf("%-6u %-16s %-5u 0x%08x  0x%08x\n",
               event->timestamp,
               event_type_to_string(event->event_type),
               event->task_id,
               event->param1,
               event->param2);
    }
    
    printf("\nTotal events: %u", event_count);
    if (event_count >= DEBUG_EVENT_BUFFER_SIZE) {
        printf(" (buffer full, oldest events overwritten)");
    }
    printf("\n");
}

/* Clear all recorded events */
void debug_clear_events(void)
{
    event_write_index = 0;
    event_count = 0;
}

#endif /* CONFIG_DEBUG_TRACE */

Example Integration: Task Switch

/* In kernel/task.c - dispatcher function */
void dispatcher(void)
{
    kcb->ticks++;
    
    /* Record task switch event */
#if CONFIG_DEBUG_TRACE
    uint16_t from_id = 0;
    uint16_t to_id = 0;
    
    if (kcb->task_current && kcb->task_current->data) {
        from_id = ((tcb_t *)kcb->task_current->data)->id;
    }
    
    /* Scheduler selects next task */
    sched_select_next_task();
    
    if (kcb->task_current && kcb->task_current->data) {
        to_id = ((tcb_t *)kcb->task_current->data)->id;
    }
    
    /* Record the context switch */
    debug_trace_event(EVENT_TASK_SWITCH, from_id, to_id);
#endif
    
    /* Continue with context switch */
}

Implementation File Structure

Recommended approach:

  • Create kernel/debug_trace.c for trace implementation
  • Create include/sys/debug_trace.h for API declarations
  • Keep tracing code separate from core kernel

Alternative approach:

  • Place code in kernel/debug.c (if combining with other debug features)
  • Use #ifdef CONFIG_DEBUG_TRACE sections within existing files

Key implementation points:

  1. No locks needed - Recording is atomic (single write operation)
  2. Minimal overhead - Just array write and index increment
  3. Safe in ISR context - No blocking operations
  4. Compile-time disable - Complete removal when not needed
  5. Extensible design - Easy to add new event types

Technical Considerations

Memory Overhead:

  • Ring buffer: 256 events × 16 bytes = 4KB (configurable)
  • Code size: ~1-2KB for tracing logic
  • Can be completely compiled out with CONFIG_DEBUG_TRACE=0

Performance Impact:

  • When disabled: Zero overhead (compiled out)
  • When enabled: ~10-20 CPU cycles per event (array write + index update)
  • No memory allocation or complex operations
  • Safe to use in time-critical paths

Thread Safety:

  • Single-writer design (only kernel records events)
  • No locks required for recording
  • Dump function should be called when system is stable

Compile-Time Configuration:

/* In config.h or Makefile */
#define CONFIG_DEBUG_TRACE 1              /* Enable event tracing */
#define DEBUG_EVENT_BUFFER_SIZE 256       /* Ring buffer size */

Usage Examples

Debugging Scheduler Issues:

/* System behaving strangely, check what happened */
void problematic_scenario(void)
{
    /* ... code that triggers issue ... */
    
    /* Dump recent events to see what led to the problem */
    debug_dump_events();
}

Integration with Exception Handler:

/* In do_trap() - show events before exception */
void do_trap(uint32_t cause, uint32_t epc)
{
    /* ... exception handling ... */
    
    print_exception_context(code, epc);
    print_system_state();
    
    /* Show what happened before the exception */
    printf("\n=== Recent Events Before Exception ===\n");
    debug_dump_events();
    
    hal_panic();
}

Example Output:

=== Event Trace ===
Tick   Event            Task  Param1      Param2
------ ---------------- ----- ----------- -----------
100    TASK_SWITCH      0     0x00000001  0x00000002
101    ISR_ENTER        2     0x00000007  0x00000000
102    ISR_EXIT         2     0x00000007  0x00000000
103    TASK_SWITCH      2     0x00000002  0x00000001
104    SEM_WAIT         1     0x80010000  0x00000000
105    TASK_SWITCH      1     0x00000001  0x00000003
106    PIPE_WRITE       3     0x80010100  0x00000040
107    SEM_POST         3     0x80010000  0x00000000
108    TASK_DELAY       3     0x0000000A  0x00000000
109    TASK_SWITCH      3     0x00000003  0x00000001
110    PIPE_READ        1     0x80010100  0x00000040
111    EXCEPTION        1     0x00000002  0x80001234

Total events: 12

Testing Strategy

Verify tracing system works correctly:

  • Create test application with known event sequence
  • Verify events are recorded in correct order
  • Test ring buffer wraparound (fill buffer completely)
  • Verify event display is readable and accurate
  • Test with CONFIG_DEBUG_TRACE disabled (should compile)

Suggested test approach:

  • Create app/trace_test.c to exercise tracing
  • Generate known sequence of events
  • Dump and manually verify output
  • Test buffer overflow behavior

Future Enhancements

The current design supports future expansion:

Runtime Control:

  • Add enable/disable per event type
  • Add event filtering by mask
  • Dynamic buffer size adjustment

Advanced Features:

  • Event timestamps with higher resolution
  • Per-event-type statistics and counters
  • Export events to external tools
  • Conditional tracing (start/stop on trigger)

Analysis Tools:

  • Event correlation and pattern detection
  • Timing analysis between events
  • Task execution timeline visualization

Additional Event Types:

  • Timer callbacks (start/cancel/fire)
  • Memory allocation/free events
  • Task priority changes
  • Custom user-defined events

These enhancements can be added without changing the core structure.

Related Issues

Part of comprehensive debugging infrastructure:

These features work together to provide full visibility into kernel behavior.

Acceptance Criteria

  • Event types defined for Scheduler, IPC, and ISR subsystems
  • Ring buffer correctly stores and overwrites events
  • debug_trace_event() API is simple and fast
  • debug_dump_events() displays events in readable format
  • Tracing can be completely disabled at compile time
  • Integration points identified in kernel code
  • No performance impact when disabled
  • Documentation for adding new event types

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions