Performance Optimization Guide

Overview

Agentic-Synth is optimized for high-performance synthetic data generation with the following targets:

Sub-second P99 response times (under 100ms for cached requests)
100+ concurrent generations supported
Memory-efficient data handling (< 400MB)
50%+ cache hit rate for typical workloads

Performance Targets

Metric	Target	Notes
P99 Latency	< 1000ms	Cached requests target < 100ms
Throughput	> 10 req/s	Scales with concurrency
Memory Usage	< 400MB	With 1000-item cache
Cache Hit Rate	> 50%	Depends on workload patterns
Error Rate	< 1%	With retry logic

Optimization Strategies

1. Context Caching

Configuration:

const synth = new AgenticSynth({
  enableCache: true,
  cacheSize: 1000,      // Adjust based on memory
  cacheTTL: 3600000,    // 1 hour in milliseconds
});

Benefits:

Reduces API calls by 50-80%
Sub-100ms latency for cache hits
Automatic LRU eviction

Best Practices:

Use consistent prompts for better cache hits
Increase cache size for repetitive workloads
Monitor cache hit rate with synth.getMetrics()
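
To make the caching behavior above concrete (a size-bounded cache with a TTL and LRU eviction), here is a minimal sketch. It is an illustration only, not Agentic-Synth's internal implementation: the class name `LruTtlCache`, its methods, and the injectable clock are all assumptions introduced for this example.

```typescript
// Hypothetical sketch of an LRU cache with a per-entry TTL.
// Not the library's implementation; all names are illustrative.
class LruTtlCache<V> {
  private entries = new Map<string, { value: V; expiresAt: number }>();
  private maxSize: number;
  private ttlMs: number;
  private now: () => number;
  hits = 0;
  misses = 0;

  constructor(maxSize: number, ttlMs: number, now: () => number = () => Date.now()) {
    this.maxSize = maxSize;
    this.ttlMs = ttlMs;
    this.now = now;
  }

  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (entry === undefined || entry.expiresAt <= this.now()) {
      if (entry !== undefined) this.entries.delete(key); // drop the expired entry
      this.misses++;
      return undefined;
    }
    // Re-insert so this key becomes the most recently used.
    this.entries.delete(key);
    this.entries.set(key, entry);
    this.hits++;
    return entry.value;
  }

  set(key: string, value: V): void {
    this.entries.delete(key);
    if (this.entries.size >= this.maxSize) {
      // A Map iterates in insertion order, so the first key is the least recently used.
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }

  // Fraction of lookups served from the cache.
  hitRate(): number {
    const total = this.hits + this.misses;
    return total === 0 ? 0 : this.hits / total;
  }
}
```

The sketch relies on `Map` preserving insertion order, which keeps both lookup and eviction constant-time without a separate linked list.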

2. Model Routing

Configuration:

const synth = new AgenticSynth({
  modelPreference: [
    'claude-sonnet-4-5-20250929',
    'claude-3-5-sonnet-20241022'
  ],
});

Features:

Automatic load balancing
Performance-based routing
Error handling and fallback
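
The fallback part of routing can be sketched as below: try each model in preference order and move to the next one on error. This covers only fallback, not load balancing or performance-based routing, and it is not the library's router. The function `generateWithFallback` and the `ModelCall` type are hypothetical names for this example.

```typescript
// Hypothetical sketch: try models in preference order, falling back on error.
type ModelCall<T> = (model: string) => Promise<T>;

async function generateWithFallback<T>(
  models: string[],
  call: ModelCall<T>,
): Promise<{ model: string; result: T }> {
  const errors: string[] = [];
  for (const model of models) {
    try {
      // The first model that succeeds wins.
      return { model, result: await call(model) };
    } catch (err) {
      // Record the failure and continue with the next preference.
      errors.push(`${model}: ${err instanceof Error ? err.message : String(err)}`);
    }
  }
  throw new Error(`All models failed: ${errors.join("; ")}`);
}
```

Collecting every error message before throwing makes it easier to tell a single flaky model from a full outage.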

3. Concurrent Generation

Configuration:

const synth = new AgenticSynth({
  maxConcurrency: 100,  // Adjust based on API limits
});

Usage:

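// Illustrative placeholder prompts; replace with your own (100+ prompts)
const prompts = Array.from({ length: 100 }, (_, i) => `Generate sample record #${i + 1}`);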
const results = await synth.generateBatch(prompts, {
  maxTokens: 500
});

Performance:

2-3x faster than sequential
Respects concurrency limits
Automatic batching
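
A concurrency limit of this kind is commonly implemented as a small worker pool. The sketch below shows one way to do it, assuming nothing about how `generateBatch` works internally; `mapWithConcurrency` is a hypothetical helper name.

```typescript
// Hypothetical sketch of a concurrency-limited batch runner.
async function mapWithConcurrency<T, R>(
  items: T[],
  limit: number,
  worker: (item: T, index: number) => Promise<R>,
): Promise<R[]> {
  const results = new Array<R>(items.length);
  let next = 0;

  // Each runner claims the next unprocessed index until none remain.
  async function runner(): Promise<void> {
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index], index);
    }
  }

  // Never start more runners than there are items, and always at least one.
  const runnerCount = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: runnerCount }, () => runner()));
  return results; // results keep the same order as the input
}
```

Because each runner starts a new task as soon as its previous one finishes, the pool stays saturated even when individual requests vary widely in duration.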

4. Memory Management

Configuration:

const synth = new AgenticSynth({
  memoryLimit: 512 * 1024 * 1024,  // 512MB
});

Features:

Automatic memory tracking
LRU eviction when over limit
Periodic cleanup with synth.optimize()
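
Eviction against a memory limit can be sketched as a store that tracks an approximate byte count and removes the least recently used entries once the budget is exceeded. This is an assumption-laden illustration, not how Agentic-Synth measures memory: `ByteBudgetStore` is a hypothetical class, and the size estimate (2 bytes per UTF-16 code unit) is a rough upper bound.

```typescript
// Hypothetical sketch: LRU eviction driven by an approximate byte budget.
class ByteBudgetStore {
  private entries = new Map<string, string>();
  private usedBytes = 0;
  private limitBytes: number;

  constructor(limitBytes: number) {
    this.limitBytes = limitBytes;
  }

  // Rough upper-bound estimate: 2 bytes per UTF-16 code unit.
  private static sizeOf(key: string, value: string): number {
    return 2 * (key.length + value.length);
  }

  bytes(): number {
    return this.usedBytes;
  }

  get(key: string): string | undefined {
    const value = this.entries.get(key);
    if (value === undefined) return undefined;
    // Re-insert to mark the key as most recently used.
    this.entries.delete(key);
    this.entries.set(key, value);
    return value;
  }

  set(key: string, value: string): void {
    this.remove(key);
    this.entries.set(key, value);
    this.usedBytes += ByteBudgetStore.sizeOf(key, value);
    // Evict oldest entries until usage is back under the limit.
    // An entry larger than the whole budget ends up evicting itself.
    while (this.usedBytes > this.limitBytes) {
      const oldest = this.entries.keys().next().value;
      if (oldest === undefined) break;
      this.remove(oldest);
    }
  }

  remove(key: string): void {
    const value = this.entries.get(key);
    if (value === undefined) return;
    this.entries.delete(key);
    this.usedBytes -= ByteBudgetStore.sizeOf(key, value);
  }
}
```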

5. Streaming for Large Outputs

Usage:

const stream = synth.generateStream(prompt, {
  maxTokens: 4096
});

for await (const chunk of stream) {
  // Process chunk immediately
  processChunk(chunk);
}

Benefits:

Lower time-to-first-byte
Reduced memory usage
Better user experience
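
The memory benefit comes from handling each chunk as it arrives instead of buffering the full output. The sketch below shows that pattern with a stand-in stream; `fakeStream` and `consumeStream` are hypothetical names, and the chunk type (`string`) is an assumption, since the shape of the chunks yielded by `generateStream` is not specified here.

```typescript
// Hypothetical stand-in for a streaming source; yields chunks one at a time.
async function* fakeStream(chunks: string[]): AsyncGenerator<string> {
  for (const chunk of chunks) {
    await Promise.resolve(); // simulate chunks arriving asynchronously
    yield chunk;
  }
}

// Consume a stream incrementally, keeping only running totals in memory.
async function consumeStream(
  stream: AsyncIterable<string>,
  onChunk: (chunk: string) => void,
): Promise<{ chunks: number; characters: number }> {
  let chunks = 0;
  let characters = 0;
  for await (const chunk of stream) {
    onChunk(chunk); // hand the chunk off immediately (write to a file, socket, UI, ...)
    chunks++;
    characters += chunk.length;
  }
  return { chunks, characters };
}
```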

Benchmarking

Running Benchmarks

# Run all benchmarks
npm run benchmark

# Run specific suite
npm run benchmark -- --suite "Throughput Test"

# With custom settings
npm run benchmark -- --iterations 20 --concurrency 200

# Generate report
npm run benchmark -- --output benchmarks/report.md

Benchmark Suites

Throughput Test: Measures requests per second
Latency Test: Measures P50/P95/P99 latencies
Memory Test: Measures memory usage and leaks
Cache Test: Measures cache effectiveness
Concurrency Test: Tests concurrent request handling
Streaming Test: Measures streaming performance
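
For reference, P50/P95/P99 figures can be derived from raw latency samples as sketched below. This uses the nearest-rank percentile definition, which is an assumption; the benchmark suite may use a different method (for example, interpolation). The function names are hypothetical.

```typescript
// Nearest-rank percentile over latency samples (milliseconds).
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error("no samples");
  if (p <= 0 || p > 100) throw new Error("p must be in (0, 100]");
  const sorted = [...samples].sort((a, b) => a - b); // numeric sort; input is left untouched
  const rank = Math.ceil((p * sorted.length) / 100); // 1-based rank
  return sorted[rank - 1];
}

function summarizeLatencies(samples: number[]): { p50: number; p95: number; p99: number } {
  return {
    p50: percentile(samples, 50),
    p95: percentile(samples, 95),
    p99: percentile(samples, 99),
  };
}
```

With few samples, P95 and P99 collapse onto the slowest observation, so tail percentiles are only meaningful over a reasonably large number of iterations.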

Analyzing Results

# Analyze performance
npm run perf:analyze

# Generate detailed report
npm run perf:report

Bottleneck Detection

The built-in bottleneck analyzer automatically detects:

1. Latency Bottlenecks

Cause: Slow API responses, network issues
Solution: Increase cache size, optimize prompts
Impact: 30-50% latency reduction

2. Throughput Bottlenecks

Cause: Low concurrency, sequential processing
Solution: Increase maxConcurrency, use batch API
Impact: 2-3x throughput increase

3. Memory Bottlenecks

Cause: Large cache, memory leaks
Solution: Reduce cache size, call optimize()
Impact: 40-60% memory reduction

4. Cache Bottlenecks

Cause: Low hit rate, small cache
Solution: Increase cache size, optimize keys
Impact: 20-40% improvement in cache hit rate
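
The four categories above can be pictured as simple threshold rules. The sketch below is not the built-in analyzer: it assumes the thresholds from the Performance Targets table and reuses the solutions listed above, and the `MetricsSnapshot` fields and `detectBottlenecks` function are hypothetical.

```typescript
// Hypothetical metrics shape; field names are assumptions for this sketch.
interface MetricsSnapshot {
  p99LatencyMs: number;
  throughputRps: number;
  memoryMb: number;
  cacheHitRate: number; // fraction between 0 and 1
}

interface Finding {
  kind: "latency" | "throughput" | "memory" | "cache";
  recommendation: string;
}

// Flag each metric that misses its target from the Performance Targets table.
function detectBottlenecks(m: MetricsSnapshot): Finding[] {
  const findings: Finding[] = [];
  if (m.p99LatencyMs >= 1000) {
    findings.push({ kind: "latency", recommendation: "Increase cache size, optimize prompts" });
  }
  if (m.throughputRps <= 10) {
    findings.push({ kind: "throughput", recommendation: "Increase maxConcurrency, use batch API" });
  }
  if (m.memoryMb >= 400) {
    findings.push({ kind: "memory", recommendation: "Reduce cache size, call optimize()" });
  }
  if (m.cacheHitRate <= 0.5) {
    findings.push({ kind: "cache", recommendation: "Increase cache size, optimize keys" });
  }
  return findings;
}
```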

CI/CD Integration

Performance Regression Detection

# Run in CI
npm run benchmark:ci

Features:

Automatic threshold checking
Fails build on regression
Generates reports for artifacts
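
Threshold checking of this kind usually compares the current run against a stored baseline with some tolerance. The sketch below illustrates the idea under stated assumptions: the `BenchResult` shape, the `findRegressions` function, and the 10% default tolerance are hypothetical, not what `benchmark:ci` actually uses.

```typescript
// Hypothetical benchmark result shape for this sketch.
interface BenchResult {
  p99LatencyMs: number;
  throughputRps: number;
}

// Return a message for each metric that regressed beyond the tolerance.
function findRegressions(
  baseline: BenchResult,
  current: BenchResult,
  tolerance: number = 0.1, // assumed default: allow 10% drift before failing
): string[] {
  const regressions: string[] = [];
  // Latency regresses when it goes up.
  if (current.p99LatencyMs > baseline.p99LatencyMs * (1 + tolerance)) {
    regressions.push(`P99 latency ${current.p99LatencyMs}ms exceeds baseline ${baseline.p99LatencyMs}ms`);
  }
  // Throughput regresses when it goes down.
  if (current.throughputRps < baseline.throughputRps * (1 - tolerance)) {
    regressions.push(`Throughput ${current.throughputRps} req/s is below baseline ${baseline.throughputRps} req/s`);
  }
  return regressions;
}
```

A CI script would typically print the returned messages and exit with a non-zero status when the list is non-empty, which is what fails the build.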

GitHub Actions Example

- name: Performance Benchmarks
  run: npm run benchmark:ci

- name: Upload Report
  uses: actions/upload-artifact@v4
  with:
    name: performance-report
    path: benchmarks/performance-report.md

Profiling

CPU Profiling

npm run benchmark:profile
node --prof-process isolate-*.log > profile.txt

Memory Profiling

node --expose-gc --max-old-space-size=512 dist/benchmarks/runner.js

Chrome DevTools

node --inspect-brk dist/benchmarks/runner.js
# Open chrome://inspect

Optimization Checklist

Performance Monitoring

Runtime Metrics

// Get current metrics
const metrics = synth.getMetrics();
console.log('Cache:', metrics.cache);
console.log('Memory:', metrics.memory);
console.log('Router:', metrics.router);

Performance Monitor

import { PerformanceMonitor } from '@ruvector/agentic-synth';

const monitor = new PerformanceMonitor();
monitor.start();

// ... run workload ...

const metrics = monitor.getMetrics();
console.log('Throughput:', metrics.throughput);
console.log('P99 Latency:', metrics.p99LatencyMs);

Bottleneck Analysis

import { BottleneckAnalyzer } from '@ruvector/agentic-synth';

const analyzer = new BottleneckAnalyzer();
const report = analyzer.analyze(metrics);

if (report.detected) {
  console.log('Bottlenecks:', report.bottlenecks);
  console.log('Recommendations:', report.recommendations);
}

Best Practices

Cache Strategy: Use prompts as cache keys, normalize formatting
Concurrency: Start with 100, then adjust to stay within your API rate limits
Memory: Monitor with getMetrics(), call optimize() periodically
Streaming: Use for outputs > 1000 tokens
Benchmarking: Run before releases, track trends
Monitoring: Enable in production, set up alerts
Optimization: Profile first, optimize bottlenecks
Testing: Include performance tests in CI/CD
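
As an example of the cache strategy item above, prompts that differ only in whitespace can be normalized before they are used as cache keys. This is a sketch under assumptions: `normalizePrompt` and `cacheKey` are hypothetical helpers, and whether Agentic-Synth already normalizes prompts internally is not stated in this guide.

```typescript
// Hypothetical helper: collapse formatting differences that should not affect caching.
function normalizePrompt(prompt: string): string {
  return prompt
    .normalize("NFC")      // unify equivalent Unicode representations
    .trim()                // drop leading and trailing whitespace
    .replace(/\s+/g, " "); // collapse runs of spaces, tabs, and newlines
}

// Include generation options in the key so different settings do not collide.
function cacheKey(prompt: string, maxTokens: number): string {
  return `${maxTokens}:${normalizePrompt(prompt)}`;
}
```

Case is deliberately left unchanged, because prompts that differ in capitalization can produce different outputs.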
