| 1 | +# Cache Analysis Research Assistant |
| 2 | + |
| 3 | +This sample demonstrates ADK context caching with a research assistant agent designed to exercise caching on both Gemini 2.0 Flash and Gemini 2.5 Flash. It contrasts explicit ADK caching with Google's built-in implicit caching. |
| 4 | + |
| 5 | +## Key Features |
| 6 | + |
| 7 | +- **App-Level Cache Configuration**: Context cache settings applied at the App level |
| 8 | +- **Large Context Instructions**: System instructions of more than 4200 tokens, enough to exceed the `min_tokens=4096` caching threshold |
| 9 | +- **Comprehensive Tool Suite**: 7 specialized research and analysis tools |
| 10 | +- **Multi-Model Support**: Works with any Gemini model; the experiment script adapts the experiment type to the model name |
| 11 | +- **Performance Metrics**: Detailed token usage tracking including `cached_content_token_count` |
| 12 | + |
| 13 | +## Cache Configuration |
| 14 | + |
| 15 | +```python |
| 16 | +ContextCacheConfig( |
| 17 | + min_tokens=4096, |
| 18 | + ttl_seconds=600, # 10 mins for research sessions |
| 19 | + cache_intervals=3) # Maximum invocations before cache invalidation |
| 20 | +``` |
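How the three settings interact can be sketched in plain Python. This is only an illustration, not ADK's implementation: `CacheSettings`, `should_create_cache`, and `can_reuse_cache` are hypothetical names, and the exact comparison rules (for example, whether the threshold is inclusive) are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CacheSettings:
    """Hypothetical stand-in mirroring the three fields shown above."""

    min_tokens: int = 4096      # smallest request worth caching
    ttl_seconds: int = 600      # how long a cache entry stays valid
    cache_intervals: int = 3    # invocations served before invalidation


def should_create_cache(settings: CacheSettings, prompt_tokens: int) -> bool:
    # Assumption: the threshold is inclusive.
    return prompt_tokens >= settings.min_tokens


def can_reuse_cache(
    settings: CacheSettings, age_seconds: float, invocations_used: int
) -> bool:
    # A cache entry is reusable only while it is both fresh enough
    # and has not yet served its maximum number of invocations.
    still_fresh = age_seconds < settings.ttl_seconds
    under_limit = invocations_used < settings.cache_intervals
    return still_fresh and under_limit
```

Under these assumptions, the sample's 4200-token system instructions clear the 4096-token threshold, and a cache entry is dropped after 10 minutes or 3 invocations, whichever comes first.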
| 21 | + |
| 22 | +## Usage |
| 23 | + |
| 24 | +### Run Cache Experiments |
| 25 | + |
| 26 | +The `run_cache_experiments.py` script compares caching modes for the model you specify: |
| 27 | + |
| 28 | +```bash |
| 29 | +# Test any Gemini model - script automatically determines experiment type |
| 30 | +python run_cache_experiments.py <model_name> --output results.json |
| 31 | + |
| 32 | +# Examples: |
| 33 | +python run_cache_experiments.py gemini-2.0-flash-001 --output gemini_2_0_results.json |
| 34 | +python run_cache_experiments.py gemini-2.5-flash --output gemini_2_5_results.json |
| 35 | +python run_cache_experiments.py gemini-1.5-flash --output gemini_1_5_results.json |
| 36 | + |
| 37 | +# Run multiple iterations for averaged results |
| 38 | +python run_cache_experiments.py <model_name> --repeat 3 --output averaged_results.json |
| 39 | +``` |
| 40 | + |
| 41 | +### Direct Agent Usage |
| 42 | + |
| 43 | +```bash |
| 44 | +# Run the agent directly |
| 45 | +adk run contributing/samples/cache_analysis |
| 46 | + |
| 47 | +# Web interface for debugging |
| 48 | +adk web contributing/samples/cache_analysis |
| 49 | +``` |
| 50 | + |
| 51 | +## Experiment Types |
| 52 | + |
| 53 | +The script automatically determines the experiment type based on the model name: |
| 54 | + |
| 55 | +### Models with "2.5" (e.g., gemini-2.5-flash) |
| 56 | +- **Explicit Caching**: ADK explicit caching + Google's implicit caching |
| 57 | +- **Implicit Only**: Google's built-in implicit caching alone |
| 58 | +- **Measures**: Added benefit of explicit caching over Google's built-in implicit caching |
| 59 | + |
| 60 | +### Other Models (e.g., gemini-2.0-flash-001, gemini-1.5-flash) |
| 61 | +- **Cached**: ADK explicit context caching enabled |
| 62 | +- **Uncached**: No caching (baseline comparison) |
| 63 | +- **Measures**: Raw performance improvement from explicit caching vs no caching |
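The selection rule described above can be sketched as a simple substring check. The function name and the returned labels are hypothetical; the actual script may structure this differently.

```python
def select_experiment(model_name: str) -> tuple[str, str]:
    """Return the pair of run modes to compare for a given model name.

    Hypothetical sketch: models whose name contains "2.5" already have
    implicit caching, so the baseline is "implicit only" rather than
    "uncached".
    """
    if "2.5" in model_name:
        return ("explicit_caching", "implicit_only")
    return ("cached", "uncached")
```

For example, `select_experiment("gemini-2.5-flash")` yields the explicit-versus-implicit comparison, while `select_experiment("gemini-2.0-flash-001")` yields the cached-versus-uncached comparison.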
| 64 | + |
| 65 | +## Tools Included |
| 66 | + |
| 67 | +1. **analyze_data_patterns** - Statistical analysis and pattern recognition in datasets |
| 68 | +2. **research_literature** - Academic and professional literature research with citations |
| 69 | +3. **generate_test_scenarios** - Comprehensive test case generation and validation strategies |
| 70 | +4. **benchmark_performance** - System performance measurement and bottleneck analysis |
| 71 | +5. **optimize_system_performance** - Performance optimization recommendations and strategies |
| 72 | +6. **analyze_security_vulnerabilities** - Security risk assessment and vulnerability analysis |
| 73 | +7. **design_scalability_architecture** - Scalable system architecture design and planning |
| 74 | + |
| 75 | +## Expected Results |
| 76 | + |
| 77 | +### Performance vs Cost Trade-offs |
| 78 | + |
| 79 | +**Note**: This sample uses a tool-heavy agent that may show different performance characteristics than simple text-based agents. |
| 80 | + |
| 81 | +### Performance Improvements |
| 82 | +- **Simple Text Agents**: Typically see 30-70% latency reduction with caching |
| 83 | +- **Tool-Heavy Agents**: May see higher latency due to cache setup overhead, but still benefit from lower input cost |
| 84 | +- **Gemini 2.5 Flash**: Compares explicit ADK caching against Google's built-in implicit caching |
| 85 | + |
| 86 | +### Cost Savings |
| 87 | +- **Input Token Cost**: 75% reduction for cached content (25% of normal cost) |
| 88 | +- **Typical Savings**: 30-60% on input costs for multi-turn conversations |
| 89 | +- **Tool-Heavy Workloads**: Cost savings often outweigh latency trade-offs |
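As a worked example of the arithmetic behind these figures, the sketch below computes the fraction of input cost saved when part of a prompt is served from cache. It assumes the 25% cached-token rate quoted above and ignores any other charges (such as cache storage); `input_cost_savings` is a hypothetical helper.

```python
# Cached tokens are billed at 25% of the normal input rate (figure from above).
CACHED_TOKEN_PRICE_FACTOR = 0.25


def input_cost_savings(prompt_tokens: int, cached_tokens: int) -> float:
    """Fraction of input cost saved relative to an uncached request."""
    if prompt_tokens <= 0:
        raise ValueError("prompt_tokens must be positive")
    if not 0 <= cached_tokens <= prompt_tokens:
        raise ValueError("cached_tokens must be between 0 and prompt_tokens")
    uncached_tokens = prompt_tokens - cached_tokens
    # Cost is expressed in "full-price token equivalents".
    billed = uncached_tokens + cached_tokens * CACHED_TOKEN_PRICE_FACTOR
    return 1 - billed / prompt_tokens
```

With 3000 of 5000 prompt tokens cached, the billed amount is 2000 + 750 = 2750 token equivalents, a saving of about 45%, which falls inside the typical range listed above. The saving approaches 75% only when nearly the whole prompt is cached.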
| 90 | + |
| 91 | +### Token Metrics |
| 92 | +- **Cached Content Token Count**: Non-zero values indicate successful cache hits |
| 93 | +- **Cache Hit Ratio**: Proportion of tokens served from cache vs fresh computation |
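The two metrics above can be combined into a single ratio across turns. The sketch below assumes each turn's usage is available as a plain dictionary with `prompt_token_count` and `cached_content_token_count` keys; the dictionary shape and the `cache_hit_ratio` helper are illustrative assumptions, not the script's actual output format.

```python
def cache_hit_ratio(turns: list[dict[str, int]]) -> float:
    """Share of prompt tokens served from cache across all turns."""
    total_prompt = sum(t.get("prompt_token_count", 0) for t in turns)
    total_cached = sum(t.get("cached_content_token_count", 0) for t in turns)
    # Avoid dividing by zero when no usage has been recorded yet.
    return total_cached / total_prompt if total_prompt else 0.0


# Illustrative numbers only: the first turn populates the cache,
# later turns read the 4200-token system instructions from it.
example_turns = [
    {"prompt_token_count": 5000, "cached_content_token_count": 0},
    {"prompt_token_count": 5200, "cached_content_token_count": 4200},
    {"prompt_token_count": 5400, "cached_content_token_count": 4200},
]
ratio = cache_hit_ratio(example_turns)
```

A ratio that stays at zero across every turn corresponds to the "Zero Cached Tokens" case in the troubleshooting section.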
| 94 | + |
| 95 | +## Troubleshooting |
| 96 | + |
| 97 | +### Zero Cached Tokens |
| 98 | +If `cached_content_token_count` is always 0: |
| 99 | +- Verify model names match exactly (e.g., `gemini-2.0-flash-001`) |
| 100 | +- Check that cache configuration `min_tokens` threshold is met |
| 101 | +- Ensure proper App-based configuration is used |
| 102 | + |
| 103 | +### Session Errors |
| 104 | +If you see "Session not found" errors: |
| 105 | +- Verify that `runner.app_name` is used when creating the session |
| 106 | +- Check whether an App or an Agent object is passed when initializing `InMemoryRunner` |
| 107 | + |
| 108 | +## Technical Implementation |
| 109 | + |
| 110 | +This sample demonstrates: |
| 111 | +- **Modern App Architecture**: App-level cache configuration following ADK best practices |
| 112 | +- **Integration Testing**: Comprehensive cache functionality validation |
| 113 | +- **Performance Analysis**: Detailed metrics collection and comparison methodology |
| 114 | +- **Error Handling**: Robust session management and cache invalidation handling |