Abstract
This RFC proposes options for creating a simulator that demonstrates OpenTelemetry instrumentation patterns for Ollama interactions. The simulator will help developers understand and validate OpenTelemetry semantic conventions for generative AI systems without requiring actual LLM execution. We present three language options, each with its own tradeoffs: Scheme, Common Lisp, and Python.
Background
OpenTelemetry's semantic conventions for generative AI define standards for events, metrics, and spans. While Ollama provides a real API for LLM interactions, we need a simulator to:
- Demonstrate correct OpenTelemetry instrumentation patterns
- Test telemetry collection without running actual models
- Validate semantic convention compliance
- Provide a reference implementation for instrumentation
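To make the semantic-convention compliance goal concrete, the following sketch shows the kind of attribute set a simulated span would carry. The attribute names follow the OpenTelemetry GenAI semantic conventions as currently published; those conventions are still evolving, so exact keys (and the `"ollama"` system value, which is an assumption here) should be checked against the current spec before relying on them.

```python
def simulated_chat_span_attributes(model: str, input_tokens: int, output_tokens: int) -> dict:
    """Build the attribute dict a simulated Ollama chat span would emit.

    Keys follow the OpenTelemetry GenAI semantic conventions; the
    conventions are evolving, so verify names against the current spec.
    """
    return {
        "gen_ai.system": "ollama",  # assumed value for Ollama
        "gen_ai.operation.name": "chat",
        "gen_ai.request.model": model,
        "gen_ai.usage.input_tokens": input_tokens,
        "gen_ai.usage.output_tokens": output_tokens,
    }

attrs = simulated_chat_span_attributes("codellama", 42, 128)
```

Because the simulator never runs a model, the token counts are simulated values; what matters for validation is that the attribute names and types match the conventions.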
Simulator Options
1. Guile Scheme Simulator
Advantages
- Native support for asynchronous programming through futures and promises
- Strong metaprogramming capabilities via macros
- Seamless integration with Emacs via Geiser
- Clean record type definitions through SRFI-9
- Built-in JSON support
Disadvantages
- Smaller ecosystem for HTTP/networking libraries
- Less common in production environments
- Fewer developers familiar with the language
Example Structure
(define-record-type <ollama-client>
  (make-ollama-client base-url)
  ollama-client?
  (base-url client-base-url))

(define-record-type <span>
  (make-span name start-time end-time attributes events)
  span?
  (name span-name)
  (start-time span-start-time)
  (end-time span-end-time)
  (attributes span-attributes)
  (events span-events))
2. Common Lisp Implementation
Advantages
- Robust CLOS object system
- Excellent condition system for error handling
- Strong support for multiprocessing via bordeaux-threads
- Rich ecosystem via Quicklisp
- Interactive development with SLIME/SLY
Disadvantages
- More complex setup compared to Python
- Learning curve for CLOS and conditions
- Larger runtime footprint
Example Structure
(defclass ollama-client ()
  ((base-url :initarg :base-url
             :reader base-url)))

(defclass span ()
  ((name :initarg :name :reader span-name)
   (start-time :initarg :start-time :reader start-time)
   (end-time :initarg :end-time :reader end-time)
   (attributes :initarg :attributes :reader attributes)
   (events :initarg :events :reader events)))
3. Python Implementation
Advantages
- Rich ecosystem for HTTP clients and async programming
- Widespread familiarity and adoption
- Strong typing support via type hints
- Simple integration with existing OpenTelemetry libraries
- Easy deployment and packaging
Disadvantages
- Less powerful metaprogramming capabilities
- Requires managing virtual environments
- GIL limitations for true parallelism
Example Structure
import os
import logging
from dataclasses import dataclass, field
from typing import Dict, List, Any

import requests

# Set up debug logging for the underlying HTTP library
logging.basicConfig(level=logging.DEBUG)
logging.getLogger("urllib3").setLevel(logging.DEBUG)

@dataclass
class Event:
    name: str
    timestamp: int
    attributes: Dict[str, Any] = field(default_factory=dict)

@dataclass
class Span:
    name: str
    start_time: int
    end_time: int
    attributes: Dict[str, Any] = field(default_factory=dict)
    events: List[Event] = field(default_factory=list)

class OllamaTelemetrySimulator:
    def __init__(self):
        """Initialize simulator for OpenTelemetry demonstration."""
        self.simulated_latency = 0.1  # seconds
Development Environment Options
1. Org-mode with Babel
- Supports all three languages
- Literate programming approach
- Interactive development
- Easy documentation
- Can tangle to source files
Example header:
#+PROPERTY: header-args:python :session ollama-demo :results output
#+PROPERTY: header-args:scheme :session ollama-demo :results output
#+PROPERTY: header-args:lisp :session ollama-demo :results output
2. Traditional Editor/IDE
- VSCode with language extensions
- Emacs with language-specific modes
- PyCharm/IntelliJ for Python
- Separate source files
- Traditional debugging tools
Recommendation
All three languages are viable options. The choice should be based on:
- Primary Use Case:
  - Interactive development/research: Scheme or Common Lisp with Org-mode
  - Production deployment: Python
  - Educational purposes: any of the three
- Team Experience:
  - Lisp experience: Scheme or Common Lisp
  - Python experience: Python
  - Learning focus: any (with Org-mode for documentation)
- Integration Requirements:
  - Existing OpenTelemetry setup: Python
  - Emacs ecosystem: Scheme
  - Complex error handling: Common Lisp
Proxy and Debugging Considerations
All implementations must support:
- HTTP_PROXY and HTTPS_PROXY environment variables
- Proxy authentication if required
- Debug logging of HTTP requests/responses
- SSL/TLS verification configuration
- Local proxy tools (e.g., mitmproxy, Charles Proxy)
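As a quick check of the environment-variable requirement, the standard library already exposes the same proxy-resolution logic that `requests` and most HTTP clients use. This sketch shows `urllib.request.getproxies()` picking up a proxy from the environment (the `localhost:8080` address is an example value, e.g. a local mitmproxy instance):

```python
import os
import urllib.request

# getproxies() reads HTTP_PROXY/HTTPS_PROXY (and lowercase variants)
# from the environment -- the same source all three implementations
# are required to honor.
os.environ["http_proxy"] = "http://localhost:8080"  # e.g. a local mitmproxy
proxies = urllib.request.getproxies()
print(proxies.get("http"))  # http://localhost:8080
```

Note that on some platforms `getproxies()` also consults system proxy settings when no environment variables are set; the environment takes precedence here.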
Language-specific Proxy Handling
Scheme Implementation
(define* (make-ollama-client #:key (host "localhost") (port 11434))
  (let ((http-proxy (getenv "HTTP_PROXY"))
        (https-proxy (getenv "HTTPS_PROXY")))
    (make-ollama-client-internal
     #:base-url (string-append "http://" host ":" (number->string port) "/api")
     #:http-proxy http-proxy
     #:https-proxy https-proxy)))
Common Lisp Implementation
(defclass ollama-client ()
  ((base-url :initarg :base-url :reader base-url)
   (http-proxy :initform (uiop:getenv "HTTP_PROXY") :reader http-proxy)
   (https-proxy :initform (uiop:getenv "HTTPS_PROXY") :reader https-proxy)))
Python Implementation
class OllamaClient:
    def __init__(self, host="localhost", port=11434):
        self.base_url = f"http://{host}:{port}/api"
        # requests honors HTTP(S)_PROXY env vars by default; copying
        # them onto the session makes the configuration explicit
        self.session = requests.Session()
        if os.environ.get('HTTP_PROXY'):
            self.session.proxies['http'] = os.environ['HTTP_PROXY']
        if os.environ.get('HTTPS_PROXY'):
            self.session.proxies['https'] = os.environ['HTTPS_PROXY']
        # Debug logging of raw HTTP traffic
        if os.environ.get('DEBUG'):
            import http.client
            http.client.HTTPConnection.debuglevel = 1
Debug Logging Format
All implementations should log:
- Full HTTP request headers and body
- Response headers and streaming chunks
- Timing information
- Proxy connection details
- TLS/SSL handshake information (when relevant)
Example debug output:
-> POST http://localhost:11434/api/generate
-> Proxy: http://localhost:8080
-> Headers: {...}
-> Body: {"model": "codellama", "prompt": "..."}
<- Status: 200 OK
<- Headers: {...}
<- Streaming chunks: [...]
<- Total time: 2.3s
Security Considerations
All implementations must handle:
- Secure HTTP connections
- Input validation
- Error handling for streaming responses
- Resource cleanup
- Token/credential management
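Credential management interacts directly with the debug-logging requirement above: full request headers must never reach the logs unredacted. This is a minimal sketch of a hypothetical redaction helper (not part of any existing library) that an implementation could run over headers before dumping them:

```python
def redacted_headers(headers: dict) -> dict:
    """Return a copy of the headers with credential-bearing values masked.

    Hypothetical helper: run over headers before they reach debug output
    so tokens never appear in logs.
    """
    sensitive = {"authorization", "proxy-authorization", "x-api-key"}
    return {k: ("<redacted>" if k.lower() in sensitive else v)
            for k, v in headers.items()}

headers = {"Authorization": "Bearer secret-token",
           "Content-Type": "application/json"}
print(redacted_headers(headers))
```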
Performance Considerations
All implementations should:
- Use streaming responses effectively
- Minimize memory usage for large responses
- Handle concurrent requests if needed
- Implement proper cleanup of resources
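The streaming and memory points above can be sketched together. Ollama's API streams newline-delimited JSON chunks; the consumer below processes each chunk as it arrives and never buffers the raw response body. `fake_stream` is a stand-in for iterating over a real HTTP response (e.g. `response.iter_lines()` in `requests`), so the sketch runs without a server:

```python
import json

def fake_stream():
    """Simulate an Ollama-style NDJSON streaming response."""
    for token in ["Hello", ", ", "world"]:
        yield json.dumps({"response": token, "done": False}).encode()
    yield json.dumps({"response": "", "done": True}).encode()

def consume(lines):
    """Accumulate only the decoded text, one chunk at a time."""
    parts = []
    for raw in lines:
        chunk = json.loads(raw)
        parts.append(chunk["response"])
        if chunk["done"]:
            break
    return "".join(parts)

print(consume(fake_stream()))  # Hello, world
```

The same loop is also the natural place to record per-chunk telemetry events and to time total streaming duration.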
Testing Strategy
All implementations should include:
- Unit tests for telemetry data structures
- Integration tests with Ollama
- Response streaming tests
- Error handling tests
- Performance benchmarks
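A minimal sketch of the first bullet, a unit test for the telemetry data structures, in the Python variant (the `Span` dataclass is repeated here so the example is self-contained; the `gen_ai.request.model` attribute name follows the GenAI semantic conventions):

```python
import time
import unittest
from dataclasses import dataclass, field
from typing import Any, Dict, List

@dataclass
class Span:
    name: str
    start_time: int
    end_time: int
    attributes: Dict[str, Any] = field(default_factory=dict)
    events: List[Any] = field(default_factory=list)

class SpanTests(unittest.TestCase):
    def test_span_duration_and_attributes(self):
        start = time.time_ns()
        span = Span("ollama.generate", start, start + 1_000_000,
                    {"gen_ai.request.model": "codellama"})
        # Duration must be non-negative and attributes must round-trip
        self.assertGreaterEqual(span.end_time - span.start_time, 0)
        self.assertEqual(span.attributes["gen_ai.request.model"], "codellama")
```

The Scheme and Common Lisp variants would mirror the same assertions with SRFI-64 and FiveAM, respectively.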
Migration Strategy
The implementation should be structured to allow:
- Easy switching between languages
- Consistent API across implementations
- Clear upgrade path for OpenTelemetry changes
- Documentation of differences between implementations
References
- OpenTelemetry Semantic Conventions for GenAI
- Ollama API Documentation
- Language-specific OpenTelemetry SDKs
- Org-mode Babel Documentation
Next Steps
- Choose implementation language based on requirements
- Set up development environment
- Implement core functionality
- Add OpenTelemetry instrumentation
- Create documentation and examples
- Add tests
- Create deployment/packaging strategy