cline_docs/bedrock-cache-strategy-documentation.md (7 additions, 6 deletions)
1. The algorithm detects that all cache points are used and new messages have been added.
2. It calculates the token count of the new messages (400 tokens).
3. It analyzes the token distribution between existing cache points and finds the smallest gap (260 tokens).
4. It calculates the required token threshold by applying a 20% increase to the smallest gap (260 \* 1.2 = 312).
5. It compares the token count of new messages (400) with this threshold (312).
6. Since the new messages have significantly more tokens than the threshold (400 > 312), it decides to combine cache points.
7. It identifies that the cache point at index 8 has the smallest token coverage (260 tokens).
8. It removes this cache point and places a new one after the new user message.
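The decision in steps 3–6 above can be sketched as follows. This is a minimal illustration only: the constant and function names (`REQUIRED_INCREASE`, `shouldCombineCachePoints`) are assumptions for the example, not the actual implementation.

```typescript
// Sketch of the reallocation decision; names are illustrative assumptions.
const REQUIRED_INCREASE = 1.2; // the 20% required increase from step 4

// gapTokens: token coverage of each existing cache point segment
function shouldCombineCachePoints(
	newMessageTokens: number,
	gapTokens: number[],
): boolean {
	const smallestGap = Math.min(...gapTokens) // step 3: 260 in the example
	const threshold = smallestGap * REQUIRED_INCREASE // step 4: 260 * 1.2 = 312
	return newMessageTokens > threshold // steps 5-6: 400 > 312, so combine
}
```

With the example values, `shouldCombineCachePoints(400, [300, 260, 450])` returns `true`, while 300 new-message tokens would fall below the 312-token threshold and leave the existing cache points untouched.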
**Output Cache Point Placements with Reallocation:**
### Key Observations
1. **Simple Initial Placement Logic**: The last user message in the range that meets the minimum token threshold is set as a cachePoint.
2. **User Message Boundary Requirement**: Cache points are placed exclusively after user messages, not after assistant messages. This ensures cache points are placed at natural conversation boundaries where the user has provided input.
3. **Token Threshold Enforcement**: Each segment between cache points must meet the minimum token threshold (100 tokens in our examples) to be considered for caching. This is enforced by a guard clause that checks if the total tokens covered by a placement meets the minimum threshold.
4. **Adaptive Placement for Growing Conversations**: As the conversation grows, the strategy adapts by preserving previous cache points when possible and only reallocating them when beneficial.
5. **Token Comparison Optimization with Required Increase**: When all cache points are used and new messages are added, the algorithm compares the token count of the new messages with the smallest combined token count of contiguous existing cache points, applying a required percentage increase (20%) to ensure the reallocation is worthwhile. Cache points are combined only if the new messages carry significantly more tokens than this threshold, so reallocation happens only when it yields a substantial net positive effect on caching efficiency.
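Observations 1–3 together can be sketched as a single placement check. The message shape, names, and the 100-token minimum below are illustrative assumptions drawn from the examples in this document, not the real implementation:

```typescript
// Illustrative sketch only; types and names are assumptions.
interface Message {
	role: "user" | "assistant"
	tokens: number
}

const MIN_TOKENS_PER_CACHE_POINT = 100 // minimum threshold from the examples

// Returns the index of the last user message whose cumulative token
// coverage meets the minimum threshold, or -1 if no placement qualifies
// (the guard clause from observation 3).
function findCachePointIndex(messages: Message[]): number {
	let covered = 0
	let candidate = -1
	for (let i = 0; i < messages.length; i++) {
		covered += messages[i].tokens
		// Cache points may only follow user messages (observation 2).
		if (messages[i].role === "user" && covered >= MIN_TOKENS_PER_CACHE_POINT) {
			candidate = i
		}
	}
	return candidate
}
```

For a conversation of user (50 tokens), assistant (30), user (40), assistant (60), the last user message that satisfies the threshold is the one at index 2 (cumulative coverage 120 ≥ 100), so the cache point lands there rather than after the trailing assistant message.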
This adaptive approach ensures that as conversations grow, the caching strategy continues to optimize token usage and response times by strategically placing cache points at the most effective positions, while avoiding inefficient reallocations that could result in a net negative effect on caching performance.