Skip to content

Latest commit

 

History

History
87 lines (70 loc) · 3.18 KB

File metadata and controls

87 lines (70 loc) · 3.18 KB

LLM Request Data Parsing Implementation Summary

Problem Statement

The audit commands implementation was using placeholder data ("Placeholder prompt", "placeholder-model") instead of parsing actual LLM request data from HTTP request bodies. This violated event sourcing principles where events must record what actually happened.

Solution Overview

Implemented a comprehensive solution to parse real LLM request data from various provider formats and integrate it into the event sourcing system.

Components Implemented

1. LLM Request Parser (llm_request_parser.rs)

  • Parses LLM requests from multiple provider formats:
    • OpenAI format (GPT models) - /v1/chat/completions, /v1/completions
    • Anthropic format (Claude models) - /v1/messages
    • AWS Bedrock format - /bedrock/*, /invoke
  • Extracts common elements: model version, prompt, and parameters
  • Provides graceful fallback for unknown formats
  • Includes comprehensive error handling

2. Audit Buffer Manager (audit_buffer.rs)

  • Manages buffering of request/response body chunks
  • Reconstructs complete bodies from multiple chunks
  • Handles out-of-order chunk delivery
  • Provides cleanup functionality for completed requests

3. Enhanced Audit Commands (audit_commands.rs)

  • Updated RecordRequestReceived to accept parsed LLM data
  • Added ProcessRequestBody command for handling complete request bodies
  • Integrated parser to extract real data instead of placeholders
  • Maintains backward compatibility with existing event flow

4. Integration Guide (audit_integration.md)

  • Comprehensive documentation for integrating the new functionality
  • Example code for audit path handlers
  • Two integration approaches: buffered and immediate

Key Features

Type Safety

  • All parsing operations use validated domain types
  • No primitive obsession - proper newtypes throughout
  • Compile-time guarantees via Rust's type system

Error Handling

  • Graceful fallback for unknown request formats
  • Detailed error messages for debugging
  • System continues operating even with parsing failures

Event Sourcing Compliance

  • Events now record actual data, not placeholders
  • Maintains immutability of events
  • Preserves full audit trail of what happened

Extensibility

  • Easy to add new provider formats
  • Parser auto-detects format based on URI and content
  • Modular design allows independent testing

Testing

  • Comprehensive unit tests for all components
  • Integration tests for command processing
  • Property-based tests for buffer management
  • All tests passing (361 passed, 0 failed)

Usage Example

// When processing audit events
match &event.event_type {
    AuditEventType::RequestBody { content, .. } => {
        let command = ProcessRequestBody {
            // ... metadata fields
            body: content.clone(),
            timestamp: Timestamp::now(),
        };

        // This will parse the body and emit proper LlmRequestReceived event
        event_store.execute_command(command).await?;
    }
}

Future Improvements

  • Response body parsing (currently marked as TODO)
  • Support for more LLM provider formats
  • Streaming request body parsing
  • Integration with metrics for model version tracking