Commit f4e1fd9

seanzhougoogle authored and copybara-github committed
chore: Add sample agent for content cache and basic profiling
PiperOrigin-RevId: 809166922
1 parent c66245a commit f4e1fd9

5 files changed, with 1960 additions and 0 deletions

Lines changed: 114 additions & 0 deletions
@@ -0,0 +1,114 @@
# Cache Analysis Research Assistant

This sample demonstrates ADK context caching with a research assistant agent built to exercise context caching on both Gemini 2.0 Flash and Gemini 2.5 Flash. It contrasts explicit ADK caching with Google's built-in implicit caching.

## Key Features

- **App-Level Cache Configuration**: Context cache settings are applied at the App level
- **Large Context Instructions**: More than 4200 tokens of system instructions, enough to cross the context caching threshold
- **Comprehensive Tool Suite**: 7 specialized research and analysis tools
- **Multi-Model Support**: Works with any Gemini model and adapts the experiment type automatically
- **Performance Metrics**: Detailed token usage tracking, including `cached_content_token_count`

## Cache Configuration

```python
ContextCacheConfig(
    min_tokens=4096,
    ttl_seconds=600,  # 10 mins for research sessions
    cache_intervals=3,  # Maximum invocations before cache invalidation
)
```

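To make the three settings concrete, here is a minimal, standard-library-only sketch of how they might gate cache creation and reuse. It is not ADK's implementation: `CachePolicy`, `should_create_cache`, and `is_cache_valid` are hypothetical names, and the exact comparison and invalidation rules inside ADK may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CachePolicy:
    """Hypothetical stand-in that mirrors the three settings shown above."""

    min_tokens: int = 4096
    ttl_seconds: int = 600
    cache_intervals: int = 3


def should_create_cache(policy: CachePolicy, prompt_tokens: int) -> bool:
    """Assumption: only prompts that reach the token threshold are cached."""
    return prompt_tokens >= policy.min_tokens


def is_cache_valid(
    policy: CachePolicy, age_seconds: float, invocations_used: int
) -> bool:
    """Assumption: a cache is reusable until its TTL expires or its
    invocation budget is spent, whichever comes first."""
    return (
        age_seconds < policy.ttl_seconds
        and invocations_used < policy.cache_intervals
    )


policy = CachePolicy()
# The sample's system instructions (over 4200 tokens) clear the threshold.
print(should_create_cache(policy, 4200))
# A two-minute-old cache that has already served three invocations.
print(is_cache_valid(policy, age_seconds=120, invocations_used=3))
```

Under these assumptions, a cache is rebuilt either after ten minutes or after three invocations, which is why short experiments can still show several cache creations.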
## Usage

### Run Cache Experiments

The `run_cache_experiments.py` script runs a caching comparison for the model you specify:

```bash
# Test any Gemini model - script automatically determines experiment type
python run_cache_experiments.py <model_name> --output results.json

# Examples:
python run_cache_experiments.py gemini-2.0-flash-001 --output gemini_2_0_results.json
python run_cache_experiments.py gemini-2.5-flash --output gemini_2_5_results.json
python run_cache_experiments.py gemini-1.5-flash --output gemini_1_5_results.json

# Run multiple iterations for averaged results
python run_cache_experiments.py <model_name> --repeat 3 --output averaged_results.json
```

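The command-line shape used above (one positional model name plus `--output` and `--repeat`) could be parsed along the following lines. This is a sketch with `argparse`, not the script's actual code; the help strings and the default of one iteration are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser matching the invocations shown in the README."""
    parser = argparse.ArgumentParser(
        description="Compare caching behavior for one Gemini model."
    )
    parser.add_argument(
        "model_name", help="Gemini model to test, e.g. gemini-2.5-flash"
    )
    parser.add_argument("--output", help="Path of the JSON results file")
    # Assumption: a single iteration when --repeat is omitted.
    parser.add_argument(
        "--repeat", type=int, default=1, help="Iterations to average over"
    )
    return parser


args = build_parser().parse_args(
    ["gemini-2.5-flash", "--repeat", "3", "--output", "averaged_results.json"]
)
print(args.model_name, args.repeat, args.output)
```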
### Direct Agent Usage

```bash
# Run the agent directly (adk run expects the agent directory, not agent.py)
adk run contributing/samples/cache_analysis

# Web interface for debugging
adk web contributing/samples/cache_analysis
```

## Experiment Types

The script automatically determines the experiment type based on the model name:

### Models with "2.5" (e.g., gemini-2.5-flash)
- **Explicit Caching**: ADK explicit caching + Google's implicit caching
- **Implicit Only**: Google's built-in implicit caching alone
- **Measures**: Added benefit of explicit caching over Google's built-in implicit caching

### Other Models (e.g., gemini-2.0-flash-001, gemini-1.5-flash)
- **Cached**: ADK explicit context caching enabled
- **Uncached**: No caching (baseline comparison)
- **Measures**: Raw performance improvement from explicit caching vs no caching

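The selection rule described above can be sketched as a simple substring check on the model name. The function name `experiment_arms` and the returned labels are illustrative assumptions; the script's own identifiers may differ.

```python
def experiment_arms(model_name: str) -> tuple[str, str]:
    """Return the two configurations compared for a given model name."""
    if "2.5" in model_name:
        # Implicit caching is always on, so the baseline is "implicit only".
        return ("explicit_caching", "implicit_only")
    # Other models get a plain cached-versus-uncached comparison.
    return ("cached", "uncached")


for name in ("gemini-2.5-flash", "gemini-2.0-flash-001", "gemini-1.5-flash"):
    print(name, experiment_arms(name))
```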
## Tools Included

1. **analyze_data_patterns** - Statistical analysis and pattern recognition in datasets
2. **research_literature** - Academic and professional literature research with citations
3. **generate_test_scenarios** - Comprehensive test case generation and validation strategies
4. **benchmark_performance** - System performance measurement and bottleneck analysis
5. **optimize_system_performance** - Performance optimization recommendations and strategies
6. **analyze_security_vulnerabilities** - Security risk assessment and vulnerability analysis
7. **design_scalability_architecture** - Scalable system architecture design and planning

## Expected Results

### Performance vs Cost Trade-offs

**Note**: This sample uses a tool-heavy agent, which may show different performance characteristics than a simple text-based agent.

### Performance Improvements
- **Simple Text Agents**: Typically see a 30-70% latency reduction with caching
- **Tool-Heavy Agents**: May see higher latency because of cache setup overhead, but caching still lowers cost
- **Gemini 2.5 Flash**: Compares explicit ADK caching against Google's built-in implicit caching

### Cost Savings
- **Input Token Cost**: 75% reduction for cached content (billed at 25% of the normal cost)
- **Typical Savings**: 30-60% on input costs for multi-turn conversations
- **Tool-Heavy Workloads**: Cost savings often outweigh latency trade-offs

### Token Metrics
- **Cached Content Token Count**: Non-zero `cached_content_token_count` values indicate successful cache hits
- **Cache Hit Ratio**: Proportion of tokens served from cache vs fresh computation

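As a worked example of the two metrics, the sketch below computes the cache hit ratio and the resulting input-cost saving from the 25% rate quoted above. The token counts are made-up illustration values, the function names are hypothetical, and the calculation ignores any storage fee charged for keeping a cache alive.

```python
# Cached input tokens are billed at 25% of the normal rate (from the list above).
CACHED_TOKEN_PRICE_FACTOR = 0.25


def cache_hit_ratio(prompt_tokens: int, cached_tokens: int) -> float:
    """Share of prompt tokens that were served from the cache."""
    return cached_tokens / prompt_tokens if prompt_tokens else 0.0


def input_cost_savings(prompt_tokens: int, cached_tokens: int) -> float:
    """Fraction of input cost saved versus paying full price for every token."""
    if prompt_tokens == 0:
        return 0.0
    fresh_tokens = prompt_tokens - cached_tokens
    billed = fresh_tokens + cached_tokens * CACHED_TOKEN_PRICE_FACTOR
    return 1 - billed / prompt_tokens


# Hypothetical request: 5000 prompt tokens, 4200 of them read from cache.
print(round(cache_hit_ratio(5000, 4200), 2))
print(round(input_cost_savings(5000, 4200), 2))
```

Because only cached tokens are discounted, the saving is always 0.75 times the hit ratio, so a request has to be mostly cached before the saving approaches the 75% ceiling.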
## Troubleshooting

### Zero Cached Tokens
If `cached_content_token_count` is always 0:
- Verify that the model name matches exactly (e.g., `gemini-2.0-flash-001`)
- Check that the request meets the `min_tokens` threshold in the cache configuration
- Ensure the App-based configuration is used

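The checklist above can be partly automated. The helper below is a hypothetical diagnostic, not part of the sample: it takes the `cached_content_token_count` values observed across requests plus the prompt size, and returns hints that follow the same order as the list.

```python
def diagnose_zero_cache(
    cached_counts: list[int], prompt_tokens: int, min_tokens: int = 4096
) -> list[str]:
    """Return troubleshooting hints when no request reported cached tokens."""
    if any(count > 0 for count in cached_counts):
        return []  # at least one cache hit, nothing to diagnose
    if prompt_tokens < min_tokens:
        return [
            f"Prompt has {prompt_tokens} tokens, below min_tokens={min_tokens}."
        ]
    return [
        "Prompt is large enough; check the exact model name.",
        "Confirm the cache config is attached at the App level.",
    ]


for hint in diagnose_zero_cache([0, 0, 0], prompt_tokens=3000):
    print(hint)
```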
### Session Errors
If you see "Session not found" errors:
- Verify that `runner.app_name` is used when creating the session
- Check whether `InMemoryRunner` is initialized with an App object or an Agent object

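One plausible cause of this error is that the session was created under a different application name than the one the runner uses for lookup. The toy store below illustrates that failure mode with plain Python; `ToySessionStore` is a hypothetical model written for this example and is not ADK's session service.

```python
class ToySessionStore:
    """Hypothetical in-memory store keyed by (app_name, user_id, session_id)."""

    def __init__(self) -> None:
        self._sessions: dict[tuple[str, str, str], dict] = {}

    def create_session(self, app_name: str, user_id: str, session_id: str) -> None:
        self._sessions[(app_name, user_id, session_id)] = {"events": []}

    def get_session(self, app_name: str, user_id: str, session_id: str) -> dict:
        key = (app_name, user_id, session_id)
        if key not in self._sessions:
            raise KeyError(f"Session not found: {session_id}")
        return self._sessions[key]


store = ToySessionStore()
runner_app_name = "cache_analysis"  # stands in for runner.app_name

# Creating the session under the runner's own app name makes the lookup succeed.
store.create_session(runner_app_name, "user_1", "session_1")
print(store.get_session(runner_app_name, "user_1", "session_1"))

# Looking it up under any other name fails, even with the same session id.
try:
    store.get_session("some_other_name", "user_1", "session_1")
except KeyError as error:
    print(error)
```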
## Technical Implementation

This sample demonstrates:
- **Modern App Architecture**: App-level cache configuration following ADK best practices
- **Integration Testing**: Comprehensive cache functionality validation
- **Performance Analysis**: Detailed metrics collection and comparison methodology
- **Error Handling**: Robust session management and cache invalidation handling
Lines changed: 17 additions & 0 deletions
@@ -0,0 +1,17 @@
# Copyright 2025 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

from . import agent

__all__ = ['agent']
